Abstract

We show that nonparametric regression is asymptotically equivalent, in Le Cam's sense, to a
sequence of Gaussian white noise experiments as the number of observations tends to infinity.
We propose a general constructive framework based on approximation spaces, which makes it
possible to achieve asymptotic equivalence even in the cases of multivariate and random design.
1 Introduction

Nonparametric regression is the model most often encountered in nonparametric
statistics because of its widespread applications. On the other hand, for theoretical
investigations the Gaussian white noise or sequence space model is often preferred
since it exhibits nice mathematical properties. The common wisdom that statistical
decisions in the two models show the same asymptotic behaviour was first formalized
and proved by Brown and Low (1996) in the one-dimensional case, using Le Cam's
concept of equivalence of statistical experiments.

In this paper we propose a unifying framework for establishing global asymptotic
equivalence between Gaussian nonparametric regression and white noise experiments
based on constructive transitions with only minimal randomisations. This framework
not only allows us to give concise proofs of known results, but also extends the
asymptotic equivalence to the multivariate and random design situation. The
multivariate result has often been alluded to, though it has never been proved,
see e.g. Hoffmann and Lepski (2002). While Brown and Zhang (1998) remark that
the regression and white noise experiments are not asymptotically equivalent for
equidistant design on $[0,1]^d$ and Sobolev classes of regularity $s \le d/2$, the only
positive result so far, by Carter (2006), states asymptotic equivalence for equidistant
design in dimensions $d = 2$ and $d = 3$ when $s > d/2$. The difficulty in extending
results to higher dimensions is that we have to go beyond piecewise constant
or linear approximations. For the dynamic model of ergodic diffusions Dalalyan
and Reiß (2006) have established multidimensional asymptotic equivalence with a
white noise model. For the case of univariate nonparametric regression, but with
non-Gaussian errors we refer to Grama and Nussbaum (1998).

In Section 2 the concept of isometric approximation spaces is introduced and
applied to local constant and Fourier approximations. The latter yields an easy
proof for asymptotic equivalence in any dimension $d$ for periodic Sobolev classes
of regularity $s > d/2$ and extends scalar results by Rohde (2004). A more flexible
framework is obtained using isomorphic approximation spaces in Section 3. As a
main application, a constructive asymptotic equivalence result is established on the
basis of wavelet multiresolution analyses, which provides equivalence results also
for non-periodic function classes. Connections to asymptotic studies by Donoho
and Johnstone (1999) and Johnstone and Silverman (2004) for wavelet estimators
are discussed. The case of a random design, uniform on a $d$-dimensional cube, is
treated in Section 4. This setting is much more involved, but can also be cast in
the isomorphic framework. The construction is based on a two-level procedure,
generalizing an idea by Brown, Cai, Low, and Zhang (2002). Fine approximation
and symmetry properties of the Fourier basis yield the main result that also in the
case of random design asymptotic equivalence holds for Sobolev regularities $s > d/2$
and any dimension $d \ge 1$.

2 Isometric approximation 

2.1 General theory 

We write $\mathscr{L}^2(\mathcal{D}) := \{f : \mathcal{D} \to \mathbb{K} \mid \|f\|_{L^2}^2 := \int_{\mathcal{D}} |f|^2 < \infty\}$
with $\mathbb{K} = \mathbb{R}$ or $\mathbb{K} = \mathbb{C}$ and $L^2(\mathcal{D})$ for the Hilbert space of equivalence
classes with respect to $\|\cdot\|_{L^2}$. Although the observations are real-valued, we shall
use complex-valued functions for simplicity when treating Fourier approximations.

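2.1 Definition. Let $\mathbb{E}_n$ be the regression experiment obtained from observing

\[ Y_i = f(x_i) + \sigma\varepsilon_i, \quad i = 1,\dots,n, \]

for $n \in \mathbb{N}$, $\mathcal{D} \subset \mathbb{R}^d$, $f : \mathcal{D} \to \mathbb{R}$ in some class $\mathcal{F}^d \subset \mathscr{L}^2(\mathcal{D})$,
fixed design points $x_i \in \mathcal{D}$ and for independent random variables $\varepsilon_i \sim N(0,1)$.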

Suppose we are given an $n$-dimensional space $S_n \subset \mathscr{L}^2(\mathcal{D})$ and a linear mapping
$D_n : \mathscr{L}^2(\mathcal{D}) \to \mathbb{K}^n$ with the following isometric property on $S_n$:

\[ \forall g_n \in S_n : \quad \|g_n\|_{L^2} = \|g_n\|_n := n^{-1/2} |D_n g_n|_{\mathbb{K}^n}. \tag{2.1} \]

By $\langle\cdot,\cdot\rangle_n$ we denote the scalar product associated with $\|\cdot\|_n$. Usually,
$D_n g = (g(x_i))_{1 \le i \le n}$ will be the point evaluation at the $n$ design points, in which case
$\|g\|_n^2 = \frac{1}{n}\sum_{i=1}^n |g(x_i)|^2$ is just the empirical norm. Let us further introduce the
linear operator

\[ \mathcal{J}_n : \mathscr{L}^2(\mathcal{D}) \to S_n, \quad \mathcal{J}_n g := (D_n|_{S_n})^{-1}(g(x_1),\dots,g(x_n))^\top. \]

For $D_n g = (g(x_i))_{i \le n}$ we have $\mathcal{J}_n = (D_n|_{S_n})^{-1} D_n$ and $\mathcal{J}_n$ is the $\|\cdot\|_n$-orthogonal
projection onto $S_n$ such that $\mathcal{J}_n g$ is the unique element of $S_n$ interpolating $g$ at
the design points $(x_i)$.

To state the first results, we refer to Le Cam and Lo Yang (2000) for the
notion of equivalence between experiments and of the Le Cam distance between
two experiments $\mathbb{E}$ and $\mathbb{G}$, which for the parameter class $\mathcal{F}$ will be denoted by
$\Delta_{\mathcal{F}}(\mathbb{E}, \mathbb{G})$. The Gaussian law on a Hilbert space $H$ with mean vector $\mu \in H$ and
covariance operator $\Sigma : H \to H$ will be denoted by $N(\mu, \Sigma)$.

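The regression experiment $\mathbb{E}_n$ can be transformed to a functional Gaussian shift
experiment by applying the isometry $(D_n|_{S_n})^{-1}$ to $Y = (Y_i) \in \mathbb{R}^n$:

\[ Z := (D_n|_{S_n})^{-1} Y = \mathcal{J}_n f + \frac{\sigma}{\sqrt{n}}\,\zeta, \tag{2.2} \]

where $\zeta := \sqrt{n}\,(D_n|_{S_n})^{-1}\varepsilon \sim N(0, \mathrm{Id}_{S_n})$ is a Gaussian white noise in $S_n$ because
for $g_n, h_n \in S_n$

\[ E[\langle\zeta, g_n\rangle_{L^2}\langle\zeta, h_n\rangle_{L^2}] = n^{-1} E[\langle\varepsilon, D_n g_n\rangle_{\mathbb{K}^n}\langle\varepsilon, D_n h_n\rangle_{\mathbb{K}^n}] = \langle g_n, h_n\rangle_n = \langle g_n, h_n\rangle_{L^2}. \]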

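By adding completely uninformative observations on the orthogonal complement
of $S_n$ in $L^2(\mathcal{D})$, the observation of $Z$ in (2.2) is equivalent to observing

\[ \langle\varphi, Z\rangle_{L^2} = \langle\varphi, \mathcal{J}_n f\rangle_{L^2} + \frac{\sigma}{\sqrt{n}}\langle\varphi, \zeta\rangle_{L^2}, \quad \varphi \in L^2(\mathcal{D}), \]

with $\langle\varphi, \zeta\rangle_{L^2} \sim N(0, \|\varphi\|_{L^2}^2)$. In differential notation we have thus established the
following equivalence.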
2.2 Proposition. Let $\mathbb{F}_n$ be the Gaussian white noise experiment in $L^2(\mathcal{D})$ given
by observing

\[ dY(x) = \mathcal{J}_n f(x)\,dx + \frac{\sigma}{\sqrt{n}}\,dB(x), \quad x \in \mathcal{D}, \]

where $f \in \mathcal{F}^d$ and $dB$ is a Gaussian white noise in $L^2(\mathcal{D})$. Then the regression
experiment $\mathbb{E}_n$ is statistically equivalent to $\mathbb{F}_n$ for any functional class $\mathcal{F}^d$.

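We now come to the first main result.

2.3 Definition. Let $\mathbb{G}_n$ be the Gaussian white noise experiment given by observing

\[ dY(x) = f(x)\,dx + \frac{\sigma}{\sqrt{n}}\,dB(x), \quad x \in \mathcal{D}, \]

where $f \in \mathcal{F}^d$ and $dB$ is a Gaussian white noise in $L^2(\mathcal{D})$.

2.4 Theorem. The Le Cam distance between $\mathbb{E}_n$ and $\mathbb{G}_n$ for the class $\mathcal{F}^d$ is
bounded by

\[ \Delta_{\mathcal{F}^d}(\mathbb{E}_n, \mathbb{G}_n) \le 1 - 2\Phi\Big(-\frac{\sqrt{n}}{2\sigma}\sup_{f \in \mathcal{F}^d}\|f - \mathcal{J}_n f\|_{L^2}\Big), \]

where $\Phi$ denotes the standard Gaussian cumulative distribution function.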

2.5 Remark. Note that $\|f - \mathcal{J}_n f\|_{L^2}^2 = \|f - P_n f\|_{L^2}^2 + \|P_n f - \mathcal{J}_n f\|_{L^2}^2$ holds where
$P_n$ is the $L^2$-orthogonal projection onto $S_n$. This means that the bound on the Le
Cam distance is always larger than the same expression involving the classical bias
estimate $\sup_{f \in \mathcal{F}^d}\|f - P_n f\|_{L^2}$. Because of $\Phi(0) = 1/2$, Theorem 2.4 yields the
rate estimate

\[ \Delta_{\mathcal{F}^d}(\mathbb{E}_n, \mathbb{G}_n) \lesssim \sigma^{-1} n^{1/2}\sup_{f \in \mathcal{F}^d}\|f - \mathcal{J}_n f\|_{L^2}. \]

Here and in the sequel $A \lesssim B$ means $A \le cB$ with a constant $c > 0$, independent
of the other parameters involved, and $A \sim B$ is short for $A \lesssim B$ and $B \lesssim A$.

Proof. Since $\mathbb{E}_n$ and $\mathbb{F}_n$ are equivalent, it suffices to establish the bound for
$\Delta_{\mathcal{F}^d}(\mathbb{F}_n, \mathbb{G}_n)$. The two latter experiments are realized on the same sample space.
Therefore the Le Cam distance is bounded by the maximal total variation distance
over the class $\mathcal{F}^d$ (Nussbaum 1996, Prop. 2.2). For Gaussian white noise the total
variation distance is given by $1 - 2\Phi(-\frac{\sqrt{n}}{2\sigma}\|f - \mathcal{J}_n f\|_{L^2})$ (Carter 2006, Section 3.2),
and the result follows. □
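
To get a feeling for the size of the bound in Theorem 2.4, the following minimal sketch evaluates $1 - 2\Phi(-\frac{\sqrt{n}}{2\sigma} b)$ for a given bias level $b$ and compares it with the linearised rate estimate of Remark 2.5. The function names and the explicit constant $1/\sqrt{2\pi}$ (obtained from $2\Phi(x) - 1 \le 2\Phi'(0)x$) are illustrative choices of this sketch and are not part of the original text.

```python
import math


def std_normal_cdf(x: float) -> float:
    """Standard Gaussian cumulative distribution function Phi."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def le_cam_bound(n: int, sigma: float, bias_sup: float) -> float:
    """Evaluate 1 - 2*Phi(-sqrt(n)/(2*sigma) * bias_sup), the bound of Theorem 2.4."""
    return 1.0 - 2.0 * std_normal_cdf(-math.sqrt(n) * bias_sup / (2.0 * sigma))


def linearised_bound(n: int, sigma: float, bias_sup: float) -> float:
    """Upper bound sqrt(n)*bias_sup/(sigma*sqrt(2*pi)), using 2*Phi(x) - 1 <= 2*Phi'(0)*x."""
    return math.sqrt(n) * bias_sup / (sigma * math.sqrt(2.0 * math.pi))


if __name__ == "__main__":
    for n in (100, 10_000):
        bias = n ** (-0.75)  # hypothetical bias decaying faster than n^{-1/2}
        print(n, le_cam_bound(n, 1.0, bias), linearised_bound(n, 1.0, bias))
```

Both quantities tend to zero exactly when the bias is of smaller order than $n^{-1/2}$, which is the condition used throughout the following sections.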



2.2 Piecewise constant approximation 

The original results of Brown and Low (1996) for equidistant design on $\mathcal{D} = (0,1]$ fit
into the proposed isometric framework. For design points $x_i = i/n$, $i = 1,\dots,n$, we
consider the $n$-dimensional space $S_n$ of piecewise constant, left-continuous functions
on $(0,1]$ with possible jumps at $i/n$, $i = 1,\dots,n-1$. Using $D_n g = (g(i/n))_{1 \le i \le n}$
we obtain for $g_n \in S_n$

\[ \|g_n\|_n^2 = \frac{1}{n}\sum_{i=1}^n |g_n(i/n)|^2 = \sum_{i=1}^n \int_{(i-1)/n}^{i/n} |g_n(u)|^2\,du = \|g_n\|_{L^2}^2 \]

such that $D_n$ has the isometric property. To infer asymptotic equivalence by
Theorem 2.4 we have to ensure that $\|f - \mathcal{J}_n f\|_{L^2} = o(n^{-1/2})$ uniformly over all $f$ in
some functional class $\mathcal{F}^d$. Considering the Hölder class of regularity $\alpha \in (0,1]$

\[ \mathcal{F}_H(\alpha, R) := \Big\{f \in C^\alpha([0,1]) \;\Big|\; \sup_{x \ne y} |f(x) - f(y)|/|x - y|^\alpha \le R\Big\}, \]

we obtain for $f \in \mathcal{F}_H(\alpha, R)$

\[ \|f - \mathcal{J}_n f\|_{L^2}^2 = \sum_{i=1}^n \int_{(i-1)/n}^{i/n} |f(x) - f(i/n)|^2\,dx \le R^2 \sum_{i=1}^n \int_{(i-1)/n}^{i/n} |x - i/n|^{2\alpha}\,dx = R^2(2\alpha + 1)^{-1} n^{-2\alpha}. \]

Consequently, asymptotic equivalence between $\mathbb{E}_n$ and $\mathbb{G}_n$ holds for any Hölder class
$\mathcal{F}_H(\alpha, R)$ with $\alpha > 1/2$ and $R > 0$ arbitrary. The approximation property of the
Haar wavelet yields even asymptotic equivalence for $L^2$-Sobolev classes of regularity
$\alpha > 1/2$.
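
The bias bound for piecewise constant interpolation can be checked numerically. The sketch below approximates $\|f - \mathcal{J}_n f\|_{L^2}^2$ by a midpoint rule for the test function $f(x) = |x - 1/2|^\alpha$, which has Hölder constant $R = 1$; the test function, the value $\alpha = 0.75$ and the quadrature resolution are assumptions made only for illustration.

```python
import numpy as np


def squared_interp_error(f, n, grid_per_cell=2000):
    """Approximate ||f - J_n f||_{L^2}^2, where J_n f equals f(i/n) on ((i-1)/n, i/n]."""
    total = 0.0
    for i in range(1, n + 1):
        # midpoint rule on the cell ((i-1)/n, i/n]
        x = (i - 1 + (np.arange(grid_per_cell) + 0.5) / grid_per_cell) / n
        total += np.mean((f(x) - f(i / n)) ** 2) / n
    return total


alpha, R = 0.75, 1.0


def f(x):
    """Test function with Hoelder constant 1 for the exponent alpha."""
    return np.abs(x - 0.5) ** alpha


if __name__ == "__main__":
    for n in (10, 100, 1000):
        err = squared_interp_error(f, n)
        bound = R**2 / (2 * alpha + 1) * n ** (-2 * alpha)
        # the last column, sqrt(n) * ||f - J_n f||, should decrease for alpha > 1/2
        print(n, err, bound, np.sqrt(n * err))
```

The printed error stays below $R^2(2\alpha+1)^{-1}n^{-2\alpha}$, and the rescaled bias $\sqrt{n}\,\|f - \mathcal{J}_n f\|_{L^2}$ decreases in $n$, in line with the requirement $\alpha > 1/2$.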

For nonuniform design $0 \le x_1 < \dots < x_n \le 1$ consider the same setting as
before, in particular $D_n g = (g(i/n))_i \ne (g(x_i))_i$. We obtain for $f \in \mathcal{F}_H(\alpha, R)$:

\[ \|f - \mathcal{J}_n f\|_{L^2}^2 = \sum_{i=1}^n \int_{(i-1)/n}^{i/n} |f(x) - f(x_i)|^2\,dx \le R^2 \sum_{i=1}^n \int_{(i-1)/n}^{i/n} |x - x_i|^{2\alpha}\,dx \]
\[ \le R^2 n^{-1}\sum_{i=1}^n \big(n^{-1} + |x_i - i/n|\big)^{2\alpha} \le 2R^2 n^{-2\alpha} + 2R^2 n^{-1}\sum_{i=1}^n |x_i - i/n|^{2\alpha}. \]

By Theorem 2.4 we have obtained the following result.



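2.6 Theorem. On the Hölder class $\mathcal{F}_H(\alpha, R)$ the Le Cam distance between
nonparametric regression with design $0 \le x_1^{(n)} < \dots < x_n^{(n)} \le 1$ and the white noise
experiment satisfies

\[ \Delta_{\mathcal{F}_H(\alpha,R)}(\mathbb{E}_n, \mathbb{G}_n) \lesssim \sigma^{-1} R \Big(n^{1-2\alpha} + \sum_{i=1}^n |x_i^{(n)} - i/n|^{2\alpha}\Big)^{1/2}. \]

Consequently, asymptotic equivalence holds whenever $\alpha \in (1/2, 1]$ and the design
satisfies $\lim_{n\to\infty}\sum_{i=1}^n |x_i^{(n)} - i/n|^{2\alpha} = 0$, e.g. if $\max_i |x_i^{(n)} - i/n| = o(n^{-1/(2\alpha)})$.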



2.7 Remark. This approach does not permit us to establish global equivalence for
the random design case in Section 4 because the standard deviations of the order
statistics $X_{(i)}$ decrease only with rate $n^{-1/2}$. Treating the random design as if it were
equidistant nevertheless yields, for estimation purposes, nearly optimal asymptotic
$L^2$-risk when $\alpha > 1/2$ (Cai and Brown 1999).



2.3 Fourier series approximation 

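In the case of $\mathcal{D} = [0,1]^d$, $d \ge 1$, and of an equidistant design $(k/m)_{k \in \{1,\dots,m\}^d}$
with $m = n^{1/d} \in \mathbb{N}$ odd, the Fourier system ($\iota := \sqrt{-1}$)

\[ \varphi_\ell(x) := \exp(2\pi\iota\langle x, \ell\rangle), \quad \ell = (\ell_1,\dots,\ell_d), \quad |\ell|_\infty \le (m-1)/2, \]

is not only $L^2$-orthonormal, but also orthonormal with respect to $\langle\cdot,\cdot\rangle_n$ for
$D_n g := (g(k/m))_k$:

\[ \langle\varphi_\ell, \varphi_{\ell'}\rangle_n = \frac{1}{n}\sum_{k \in \{1,\dots,m\}^d} \varphi_\ell(k/m)\overline{\varphi_{\ell'}(k/m)}
 = m^{-d}\sum_{k_1,\dots,k_d=1}^m \prod_{i=1}^d \exp\big(2\pi\iota k_i(\ell_i - \ell_i')/m\big) \]
\[ = \prod_{i=1}^d \Big(\frac{1}{m}\sum_{k=1}^m \exp\big(2\pi\iota k(\ell_i - \ell_i')/m\big)\Big)
 = \begin{cases} 1, & \text{if } m \mid (\ell_i - \ell_i') \text{ for all } i, \\ 0, & \text{otherwise.} \end{cases} \tag{2.3} \]

Consequently, the space of trigonometric polynomials $S_n := \mathrm{span}(\varphi_\ell,\ |\ell|_\infty \le (m-1)/2)$
satisfies the isometric property (2.1).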
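
The discrete orthonormality (2.3) is easy to verify numerically. The following sketch builds the matrix of point evaluations of the Fourier system on the equidistant grid and checks that its empirical Gram matrix is the identity; the concrete values $d = 2$ and $m = 5$ are arbitrary choices made for this illustration.

```python
import itertools

import numpy as np

m, d = 5, 2  # m must be odd; n = m^d design points
n = m**d

# equidistant design points k/m with k in {1, ..., m}^d, one row per point
design = np.array(list(itertools.product(range(1, m + 1), repeat=d))) / m

# frequencies l with |l|_infty <= (m - 1)/2, one row per frequency
half = (m - 1) // 2
freqs = np.array(list(itertools.product(range(-half, half + 1), repeat=d)))

# V[i, l] = phi_l(x_i) = exp(2*pi*i*<x_i, l>)
V = np.exp(2j * np.pi * design @ freqs.T)

# gram[l, l'] = (1/n) * sum_i conj(phi_l(x_i)) * phi_l'(x_i)
gram = V.conj().T @ V / n

if __name__ == "__main__":
    print("empirically orthonormal:", np.allclose(gram, np.eye(n)))
```

Since there are exactly $n = m^d$ frequencies with $|\ell|_\infty \le (m-1)/2$, the matrix of point evaluations is a scaled unitary matrix, which is the isometric property (2.1) in matrix form.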

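The $d$-dimensional periodic Sobolev class of regularity $s$ and radius $R$ on $[0,1]^d$
is given by

\[ \mathcal{F}^d_{S,\mathrm{per}}(s, R) := \Big\{f \in L^2([0,1]^d) \;\Big|\; \sum_{\ell \in \mathbb{Z}^d} |\ell|_\infty^{2s} |\langle f, \varphi_\ell\rangle|^2 \le R^2\Big\}. \]

Due to the strong cancellation property (2.3) of the scalar product $\langle\cdot,\cdot\rangle_n$ we derive
explicitly

\[ \mathcal{J}_n f = \sum_{|\ell|_\infty \le (m-1)/2} \Big(\sum_{k \in \mathbb{Z}^d} \langle f, \varphi_{\ell + km}\rangle\Big)\varphi_\ell. \]

In view of Remark 2.5 we first bound the classical bias:

\[ \sup_{f \in \mathcal{F}^d_{S,\mathrm{per}}(s,R)} \|f - P_n f\|_{L^2}^2 = \sup_{f \in \mathcal{F}^d_{S,\mathrm{per}}(s,R)} \sum_{|\ell|_\infty > (m-1)/2} |\langle f, \varphi_\ell\rangle|^2 = R^2\Big(\frac{m+1}{2}\Big)^{-2s}. \]

For $s > d/2$ we obtain, using the Cauchy-Schwarz inequality,

\[ \sup_{f \in \mathcal{F}^d_{S,\mathrm{per}}(s,R)} \|P_n f - \mathcal{J}_n f\|_{L^2}^2
 = \sup_{f \in \mathcal{F}^d_{S,\mathrm{per}}(s,R)} \sum_{|\ell|_\infty \le (m-1)/2} \Big|\sum_{k \in \mathbb{Z}^d\setminus\{0\}} \langle f, \varphi_{\ell + km}\rangle\Big|^2 \]
\[ \le \Big(\sup_{f \in \mathcal{F}^d_{S,\mathrm{per}}(s,R)} \sum_{|\ell|_\infty \le (m-1)/2}\ \sum_{k \in \mathbb{Z}^d\setminus\{0\}} |\ell + km|_\infty^{2s} |\langle f, \varphi_{\ell + km}\rangle|^2\Big)
 \times \Big(\sup_{|\ell|_\infty \le (m-1)/2}\ \sum_{k \in \mathbb{Z}^d\setminus\{0\}} |\ell + km|_\infty^{-2s}\Big) \]
\[ \le R^2 m^{-2s} \sup_{|\ell|_\infty \le (m-1)/2}\ \sum_{k \in \mathbb{Z}^d\setminus\{0\}} |k + \ell/m|_\infty^{-2s} \]
\[ \le R^2 m^{-2s}\Big(2^{2s}(2^d - 1) + \sum_{k \in \mathbb{Z}^d\setminus\{0\}} |k|_\infty^{-2s}\Big). \]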



Hence, using Theorem 2.4 we have proved the following result, which extends the
scalar results by Brown and Low (1996) and more specifically Rohde (2004) to any
dimension $d \ge 1$.

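2.8 Theorem. For $d$-dimensional periodic Sobolev classes $\mathcal{F}^d_{S,\mathrm{per}}(s, R)$ with
regularity $s > d/2$ and equidistant design on the cube $[0,1]^d$ the nonparametric regression
experiment $\mathbb{E}_n$ and the Gaussian shift experiment $\mathbb{G}_n$ are asymptotically equivalent
as $n \to \infty$. The Le Cam distance satisfies

\[ \Delta_{\mathcal{F}^d_{S,\mathrm{per}}(s,R)}(\mathbb{E}_n, \mathbb{G}_n) \lesssim \sigma^{-1} R\, n^{1/2 - s/d}. \]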
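
The cancellation property (2.3) means that a frequency $\ell + km$ outside the approximation space is indistinguishable, on the design grid, from the frequency $\ell$ inside it, which is why the interpolation bias collects all aliased Fourier coefficients. The sketch below illustrates this in dimension $d = 1$; the grid size $m = 7$, the amplitude 3 and the chosen frequencies are hypothetical values used only for this example.

```python
import numpy as np

m = 7  # odd number of design points, d = 1 so that n = m
x = np.arange(1, m + 1) / m  # equidistant design k/m
half = (m - 1) // 2
ells = np.arange(-half, half + 1)  # frequencies spanning S_n


def interp_coeffs(values):
    """Coefficients <f, phi_l>_n of the interpolant J_n f in the basis (phi_l)."""
    V = np.exp(2j * np.pi * np.outer(x, ells))  # V[i, l] = phi_l(x_i)
    return V.conj().T @ values / m


ell0, k = 2, 1


def f(t):
    """Single high frequency ell0 + k*m, which lies outside S_n."""
    return 3.0 * np.exp(2j * np.pi * (ell0 + k * m) * t)


coeffs = interp_coeffs(f(x))

if __name__ == "__main__":
    for ell, c in zip(ells, coeffs):
        print(ell, np.round(c, 12))
```

The whole amplitude shows up at the frequency $\ell_0$, although the $L^2$-projection $P_n f$ of this function onto $S_n$ is zero; this is the term $\|P_n f - \mathcal{J}_n f\|_{L^2}$ controlled above by the Cauchy-Schwarz argument.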



3 Isomorphic approximation 

3.1 General theory 

We extend the preceding framework by merely requiring an isomorphic property.
Since it will suffice for the subsequent applications, we specialize here immediately
to $D_n g = (g(x_1),\dots,g(x_n))$. Let $S_n \subset \mathscr{L}^2(\mathcal{D})$, $\dim S_n = n$, have the property

\[ \forall g_n \in S_n : \quad g_n(x_1) = \dots = g_n(x_n) = 0 \ \Longrightarrow\ g_n = 0. \tag{3.1} \]

Let

\[ \langle f, g\rangle_n := \frac{1}{n}\sum_{i=1}^n f(x_i)\overline{g(x_i)}, \quad\text{resp.}\quad \langle v, g\rangle_n := \frac{1}{n}\sum_{i=1}^n v_i\overline{g(x_i)}, \quad f, g \in \mathscr{L}^2,\ v \in \mathbb{R}^n, \]

and $\|g\|_n^2 = \langle g, g\rangle_n$. In this notation Equation (3.1) is equivalent to the isomorphy
of the norms $\|\cdot\|_n$ and $\|\cdot\|_{L^2}$ on $S_n$:

\[ \exists A_n, B_n > 0\ \forall g_n \in S_n : \quad A_n\|g_n\|_{L^2} \le \|g_n\|_n \le B_n\|g_n\|_{L^2}. \tag{3.2} \]

We choose any $L^2$-orthonormal basis $(\varphi_j)_{1 \le j \le n}$ of $S_n$ and introduce the linear
mappings

\[ \Pi_n, \mathcal{J}_n : \mathscr{L}^2(\mathcal{D}) \to S_n, \quad \Pi_n g := \sum_{j=1}^n \langle g, \varphi_j\rangle_n\varphi_j, \quad \mathcal{J}_n g := (\Pi_n|_{S_n})^{-1}\Pi_n g. \]

Observe the following properties: for $g_n, h_n \in S_n$ we have $\langle\Pi_n g_n, h_n\rangle_{L^2} = \langle g_n, h_n\rangle_n$
and $\|\Pi_n|_{S_n}\| \le B_n^2$, $\|(\Pi_n|_{S_n})^{-1}\| \le A_n^{-2}$; $\mathcal{J}_n$ is a projection onto $S_n$ and $\mathcal{J}_n g$
interpolates $g$ at the design points $(x_i)$; $\Pi_n$ and $\mathcal{J}_n$ are independent of the choice
of the basis $(\varphi_j)$.
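
In matrix terms, $\Pi_n|_{S_n}$ is the empirical Gram matrix of the basis, the isomorphy constants are the square roots of its extreme eigenvalues, and $\mathcal{J}_n g$ is obtained by solving a linear system. The following sketch illustrates this for a cosine basis on slightly perturbed equidistant points; the basis, the perturbation size and the test function $g = \exp$ are assumptions of this example and do not appear in the original text.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8
# mildly perturbed midpoints (i + 1/2)/n; the perturbation size 0.2/n is an arbitrary choice
x = (np.arange(n) + 0.5 + rng.uniform(-0.2, 0.2, n)) / n


def basis(t):
    """L^2([0,1])-orthonormal cosine basis: phi_1 = 1, phi_j = sqrt(2)*cos(pi*(j-1)*t)."""
    t = np.atleast_1d(t)
    B = np.sqrt(2.0) * np.cos(np.pi * np.outer(t, np.arange(n)))
    B[:, 0] = 1.0
    return B  # shape (len(t), n)


V = basis(x)  # V[i, j] = phi_j(x_i)
Pi = V.T @ V / n  # matrix of Pi_n restricted to S_n in the basis (phi_j)
eigenvalues = np.linalg.eigvalsh(Pi)
A_n, B_n = np.sqrt(eigenvalues[0]), np.sqrt(eigenvalues[-1])  # best constants in (3.2)


def g(t):
    """Function to be interpolated."""
    return np.exp(t)


# coefficients of J_n g = (Pi_n|S_n)^{-1} Pi_n g in the basis (phi_j)
coeffs = np.linalg.solve(Pi, V.T @ g(x) / n)

if __name__ == "__main__":
    print("A_n =", A_n, "B_n =", B_n)
    print("interpolates g at the design:", np.allclose(V @ coeffs, g(x)))
```

For exactly equidistant midpoints the Gram matrix of this basis is the identity, so the example stays close to the isometric case of Section 2; a less regular design pushes $A_n$ towards zero.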

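The regression experiment can be transformed to a functional Gaussian shift
by expanding the observations $(Y_i)$ in the basis $(\varphi_j)$:

\[ Z_1 := \sum_{j=1}^n \langle Y, \varphi_j\rangle_n\varphi_j = \Pi_n f + \frac{\sigma}{\sqrt{n}}(\Pi_n|_{S_n})^{1/2}\zeta \in S_n, \tag{3.3} \]

with Gaussian white noise $\zeta := (\Pi_n|_{S_n})^{-1/2}\big(\sqrt{n}\sum_{j=1}^n \langle\varepsilon, \varphi_j\rangle_n\varphi_j\big) \sim N(0, \mathrm{Id}_{S_n})$
because

\[ E[\langle\zeta, g_n\rangle\langle\zeta, h_n\rangle] = \langle(\Pi_n|_{S_n})^{-1/2}g_n, (\Pi_n|_{S_n})^{-1/2}h_n\rangle_n = \langle g_n, h_n\rangle_{L^2}, \quad g_n, h_n \in S_n. \]

By applying $(\Pi_n|_{S_n})^{-1/2}$ and $(\Pi_n|_{S_n})^{-1}$, respectively, we conclude that the
regression experiment $\mathbb{E}_n$ is also equivalent to observing

\[ Z_2 := (\Pi_n|_{S_n})^{-1/2}Z_1 = (\Pi_n|_{S_n})^{1/2}\mathcal{J}_n f + \frac{\sigma}{\sqrt{n}}\zeta \in S_n, \tag{3.4} \]
\[ Z_3 := (\Pi_n|_{S_n})^{-1}Z_1 = \mathcal{J}_n f + \frac{\sigma}{\sqrt{n}}(\Pi_n|_{S_n})^{-1/2}\zeta \in S_n \tag{3.5} \]

with $\zeta \sim N(0, \mathrm{Id}_{S_n})$.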

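3.1 Theorem. The regression experiment $\mathbb{E}_n$ is equivalent to each of the experiments
given by observing $Z_1$ in (3.3), $Z_2$ in (3.4) and $Z_3$ in (3.5), respectively.
The Le Cam distance between $\mathbb{E}_n$ and $\mathbb{G}_n$ for the class $\mathcal{F}^d$ satisfies the bounds

\[ \Delta_{\mathcal{F}^d}(\mathbb{E}_n, \mathbb{G}_n) \le 1 - 2\Phi\Big(-\frac{\sqrt{n}}{2\sigma}\sup_{f \in \mathcal{F}^d}\|f - (\Pi_n|_{S_n})^{1/2}\mathcal{J}_n f\|_{L^2}\Big), \tag{3.6} \]
\[ \Delta_{\mathcal{F}^d}(\mathbb{E}_n, \mathbb{G}_n) \le 1 - 2\Phi\Big(-\frac{\sqrt{n}}{2\sigma}\sup_{f \in \mathcal{F}^d}\|f - \mathcal{J}_n f\|_{L^2}\Big) + \sqrt{2}\,\|(\Pi_n|_{S_n})^{-1} - \mathrm{Id}_{S_n}\|_{HS}, \tag{3.7} \]

where $\|\cdot\|_{HS}$ denotes the Hilbert-Schmidt norm of an operator.

Proof. It remains to prove the second part. The first bound (3.6) follows from
the equivalence with observing $Z_2$ by the same arguments as for Theorem 2.4. To
establish (3.7), we use the fact that the Hellinger distance between two multivariate
normal distributions with the same mean satisfies

\[ H^2\big(N(\mu, a\Sigma), N(\mu, a\,\mathrm{Id}_{\mathbb{R}^n})\big) \le 2\|\Sigma - \mathrm{Id}_{\mathbb{R}^n}\|_{HS}^2, \quad \Sigma \in \mathbb{R}^{n \times n},\ a > 0, \tag{3.8} \]

which follows e.g. from (Brown, Cai, Low, and Zhang 2002, Lemma 3) via the
diagonalisation $\Sigma = O^\top\mathrm{diag}(\lambda_1,\dots,\lambda_n)O$ and the property $\|\Sigma - \mathrm{Id}_{\mathbb{R}^n}\|_{HS}^2 =
\|O(\Sigma - \mathrm{Id}_{\mathbb{R}^n})O^\top\|_{HS}^2 = \sum_{i=1}^n (\lambda_i - 1)^2$. Therefore the total variation distance between
the laws of $Z_3$ and $Z_4 := \mathcal{J}_n f + \frac{\sigma}{\sqrt{n}}\zeta$ is bounded by

\[ \|\mathcal{L}(Z_3) - \mathcal{L}(Z_4)\|_{TV} \le H(\mathcal{L}(Z_3), \mathcal{L}(Z_4)) \le \sqrt{2}\,\|(\Pi_n|_{S_n})^{-1} - \mathrm{Id}_{S_n}\|_{HS}. \]

The by now standard arguments yield with obvious notation

\[ \Delta_{\mathcal{F}^d}(\mathbb{E}_n, \mathbb{G}_n) = \Delta_{\mathcal{F}^d}(Z_3, \mathbb{G}_n) \le \Delta_{\mathcal{F}^d}(Z_4, \mathbb{G}_n) + \Delta_{\mathcal{F}^d}(Z_4, Z_3) \]
\[ \le 1 - 2\Phi\Big(-\frac{\sqrt{n}}{2\sigma}\sup_{f \in \mathcal{F}^d}\|f - \mathcal{J}_n f\|_{L^2}\Big) + \sqrt{2}\,\|(\Pi_n|_{S_n})^{-1} - \mathrm{Id}_{S_n}\|_{HS}, \]

as asserted. □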
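
The Hellinger bound (3.8) can be illustrated numerically for centred Gaussian laws. The sketch below assumes the normalisation $H^2(P,Q) = \int(\sqrt{dP} - \sqrt{dQ})^2 = 2 - 2\,\mathrm{BC}(P,Q)$ with the Bhattacharyya coefficient $\mathrm{BC}$, which may differ by a constant factor from the convention of the cited lemma; the dimension and the size of the random perturbation are arbitrary choices.

```python
import numpy as np


def hellinger_sq_centred_gaussians(Sigma):
    """H^2 = 2 - 2*BC between N(0, Sigma) and N(0, Id), with BC the Bhattacharyya coefficient."""
    dim = Sigma.shape[0]
    _, logdet_sigma = np.linalg.slogdet(Sigma)
    _, logdet_mean = np.linalg.slogdet((Sigma + np.eye(dim)) / 2.0)
    bc = np.exp(0.25 * logdet_sigma - 0.5 * logdet_mean)
    return 2.0 - 2.0 * bc


rng = np.random.default_rng(1)
dim = 6
E = 0.05 * rng.standard_normal((dim, dim))
Sigma = np.eye(dim) + (E + E.T) / 2.0  # symmetric perturbation of the identity

h_squared = hellinger_sq_centred_gaussians(Sigma)
hs_squared = np.sum((Sigma - np.eye(dim)) ** 2)  # squared Hilbert-Schmidt norm

if __name__ == "__main__":
    print("H^2 =", h_squared, " 2*||Sigma - Id||_HS^2 =", 2.0 * hs_squared)
```

The common scale factor $a$ in (3.8) does not affect the Hellinger distance, which is why it is set to one in this check.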
3.2 Linear spline approximation 

Let us briefly explain how the approach by Carter (2006) fits into the isomorphic
framework. As in Section 2.3 we consider equidistant design points
$(k/m)_{k \in \{1,\dots,m\}^d}$ with $m = n^{1/d} \in \mathbb{N}$ and periodic functions on the unit cube
$\mathcal{D} = [0,1]^d$. The space $S_n$ is spanned by the periodized and tensorized linear
B-splines

\[ b_k(x) = b_k(x_1,\dots,x_d) = \prod_{r=1}^d b(m x_r - k_r \bmod 1), \quad b := \mathbf{1}_{[-1/2,1/2]} * \mathbf{1}_{[-1/2,1/2]}, \]

indexed by $k \in \{1,\dots,m\}^d$. For $\alpha \in (1,2]$ it is well known (cf. De Boor (2001))
that interpolation on $S_n$ for the periodic Hölder class

\[ \mathcal{F}^d_{H,\mathrm{per}}(\alpha, R) := \Big\{f \in C^\alpha(\mathbb{R}^d) \;\Big|\; f\ \mathbb{Z}^d\text{-periodic},\ \sup_{x \ne y} |\nabla f(x) - \nabla f(y)|/|x - y|^{\alpha - 1} \le R\Big\} \]

satisfies the estimate

\[ \sup_{f \in \mathcal{F}^d_{H,\mathrm{per}}(\alpha,R)} \|f - \mathcal{J}_n f\|_{L^2([0,1]^d)} \lesssim R\,n^{-\alpha/d}. \tag{3.9} \]

On the other hand, we have for $g_n \in S_n$

\[ \|g_n\|_{L^2}^2 = \Big\|\sum_{k \in \{1,\dots,m\}^d} g_n(k/m)\,b_k\Big\|_{L^2}^2 = \sum_{k,\ell \in \{1,\dots,m\}^d} \langle b_k, b_\ell\rangle_{L^2}\,g_n(k/m)\,g_n(\ell/m) \]

with $\langle b_k, b_\ell\rangle_{L^2} = 0$ for $|k - \ell|_\infty > 1$ and $\langle b_k, b_\ell\rangle_{L^2} = 4^{\#\{r : k_r = \ell_r\}}/(6^d n)$ for
$|k - \ell|_\infty \le 1$. Since $\sum_\ell \langle b_k, b_\ell\rangle_{L^2} = \langle b_k, 1\rangle_{L^2} = n^{-1}$, a weighted Cauchy-Schwarz
inequality yields

\[ \|g_n\|_{L^2}^2 \le n^{-1}\sum_{k \in \{1,\dots,m\}^d} g_n(k/m)^2 = \langle g_n, g_n\rangle_n = \langle\Pi_n g_n, g_n\rangle_{L^2} \]

and we conclude, using the ordering of symmetric operators, that $(\Pi_n|_{S_n})^{-1} \le \mathrm{Id}_{S_n}$.

Adding to the observation $Z_3$ in (3.5) independent Gaussian noise
$\eta \sim N\big(0, \frac{\sigma^2}{n}(\mathrm{Id}_{S_n} - (\Pi_n|_{S_n})^{-1})\big)$, we infer that the regression experiment $\mathbb{E}_n$ is more
informative than observing

\[ Z_5 := Z_3 + \eta = \mathcal{J}_n f + \frac{\sigma}{\sqrt{n}}\tilde\zeta \in S_n \tag{3.10} \]

with Gaussian white noise $\tilde\zeta := (\Pi_n|_{S_n})^{-1/2}\zeta + n^{1/2}\sigma^{-1}\eta \sim N(0, \mathrm{Id}_{S_n})$. This
randomization together with estimate (3.9) shows that the regression experiment $\mathbb{E}_n$
is asymptotically at least as informative as the Gaussian experiment $\mathbb{G}_n$ on Hölder
classes $\mathcal{F}^d_{H,\mathrm{per}}(\alpha, R)$ with $\alpha > d/2$ and $d \in \{1,2,3\}$. Together with an (easier)
randomization in the other direction and a more sophisticated boundary treatment
for non-periodic function classes this reproduces the proof by Carter (2006) for
asymptotic equivalence of regression and white noise experiments in dimensions 2
and 3. For B-splines of higher order the interpolation property $b_k(\ell/m) = \delta_{k,\ell}$ gets
lost and $(\Pi_n|_{S_n})^{-1} \le \mathrm{Id}_{S_n}$ cannot be shown such that a more refined analysis is
needed. This will be accomplished in the next section for a similar approach using
compactly supported wavelets.
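
The operator inequality $(\Pi_n|_{S_n})^{-1} \le \mathrm{Id}_{S_n}$ for linear splines amounts to the statement that the rescaled Gram matrix $(n\langle b_k, b_\ell\rangle_{L^2})_{k,\ell}$ has all eigenvalues at most one. The following sketch checks this in dimension $d = 1$, where the entries are $4/6$ on the diagonal and $1/6$ for periodic neighbours; the grid size $m = 16$ is an arbitrary choice for this illustration.

```python
import numpy as np

m = 16  # d = 1, so n = m design points k/m

# G[k, l] = n * <b_k, b_l>_{L^2}: 4/6 on the diagonal, 1/6 for periodic neighbours
G = np.zeros((m, m))
for k in range(m):
    G[k, k] = 4.0 / 6.0
    G[k, (k + 1) % m] = 1.0 / 6.0
    G[k, (k - 1) % m] = 1.0 / 6.0

# For g_n with nodal values c: ||g_n||_{L^2}^2 = c^T G c / n and ||g_n||_n^2 = |c|^2 / n,
# so the eigenvalues of G are the possible ratios ||g_n||_{L^2}^2 / ||g_n||_n^2.
eigenvalues = np.linalg.eigvalsh(G)

if __name__ == "__main__":
    print("smallest ratio:", eigenvalues.min(), "largest ratio:", eigenvalues.max())
```

Because the matrix is circulant, its eigenvalues are $2/3 + \cos(2\pi j/m)/3$, so the ratios lie in $[1/3, 1]$ and the added noise $\eta$ in (3.10) has a nonnegative covariance.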

3.3 Wavelet multiresolution analysis 

The construction. Let us assume an equidistant dyadic design $(k2^{-j})_{k \in \{1,\dots,2^j\}^d}$
with $n = 2^{jd}$ points for some $j \in \mathbb{N}$ and $\mathcal{D} = [0,1]^d$. We consider a wavelet
multiresolution analysis $(V_j)_{j \ge 0}$ on $L^2([0,1]^d)$ obtained by periodisation and tensor
products. Let $\varphi$ be a standard orthonormal scaling function of an r-regular
multiresolution analysis for $L^2(\mathbb{R})$, that is $(\varphi(\cdot + k))_{k \in \mathbb{Z}}$ forms an orthonormal system
in $L^2(\mathbb{R})$ and satisfies $\int\varphi = 1$ as well as the polynomial exactness condition that
$\sum_{k \in \mathbb{Z}} k^q\varphi(x - k) - x^q$ is a polynomial of maximal degree $q - 1$ for all $q = 0,\dots,R - 1$
(Cohen 2000, Thm. 16.1). We suppose that $\varphi$ has compact support in $[-S + 1, S]$,
like in Daubechies's construction, so that the functions $\varphi_{jk} : [0,1]^d \to \mathbb{R}$, $j \ge 1$,

\[ \varphi_{jk}(x_1,\dots,x_d) := 2^{jd/2}\sum_{m \in \mathbb{Z}^d}\prod_{i=1}^d \varphi(2^j x_i - k_i + 2^j m_i) \]

are well defined and form an orthonormal system in $L^2([0,1]^d)$ (Wojtaszczyk 1997,
Prop. 2.21). We set $S_n := V_j := \mathrm{span}\{\varphi_{jk} \mid k \in \{1,\dots,2^j\}^d\}$.

Periodic approximation. Polynomial exactness and continuity of $\varphi$ imply for
$q = 0,\dots,R - 1$ and any $x \in \mathbb{R}$ (Sweldens and Piessens 1993)



fee {l,...,2^} d , with 
This identity is fundamental for our purposes because it implies for $\mathbb{Z}^d$-periodic
functions $h : \mathbb{R}^d \to \mathbb{R}$ that coincide with a polynomial $p$ of maximal degree $R - 1$
on $\prod_{i=1}^d [2^{-j}(k_i - S - 1), 2^{-j}(k_i + S)]$:



\[ \langle h, \varphi_{jk}\rangle_{L^2} = \sum_{m \in \mathbb{Z}^d} 2^{jd/2}\int_{[0,1]^d} h(x)\prod_{i=1}^d \varphi(2^j(x_i + m_i) - k_i)\,dx \]
\[ = 2^{jd/2}\int_{\mathbb{R}^d} h(x)\prod_{i=1}^d \varphi(2^j x_i - k_i)\,dx \]
\[ = 2^{-jd/2}\int_{\mathbb{R}^d} p(2^{-j}(x + k))\prod_{i=1}^d \varphi(x_i)\,dx \]
\[ = 2^{-jd/2}\sum_{m \in \mathbb{Z}^d} p(2^{-j}(m + k))\prod_{i=1}^d \varphi(m_i) \]
\[ = 2^{-jd}\sum_{m \in \{1,\dots,2^j\}^d} h(2^{-j}m)\,\varphi_{jk}(2^{-j}m) \]
\[ = \langle h, \varphi_{jk}\rangle_n, \]

where we identified $n = 2^{jd}$. For any $\mathbb{Z}^d$-periodic function $g \in H^s([0,1]^d)$ with
$s \in (d/2, R)$ this local polynomial reproduction property implies by standard, but
sophisticated arguments for direct estimates (Cohen 2000, Thm. 30.6)

\[ \|g - \Pi_n g\|_{L^2} \lesssim 2^{-js}\|g\|_{H^s} = n^{-s/d}\|g\|_{H^s}, \tag{3.11} \]

where $\|\cdot\|_{H^s}$ denotes the standard $L^2$-Sobolev norm of regularity $s$ on $[0,1]^d$. We
split the bias term and obtain by functional calculus

\[ \|f - (\Pi_n|_{S_n})^{-1/2}\Pi_n f\|_{L^2} \le \|f - \Pi_n f\|_{L^2} + \|\Pi_n f - (\Pi_n|_{S_n})^{-1/2}\Pi_n f\|_{L^2} \]
\[ = \|f - \Pi_n f\|_{L^2} + \|h(\Pi_n|_{S_n})(\mathrm{Id} - \Pi_n)\Pi_n f\|_{L^2} \]

with $h : \mathbb{R}_+ \to \mathbb{R}$, $h(x) := 1/(x + x^{1/2}) = (x^{-1/2} - 1)/(1 - x)$. Since $h$ decreases
monotonically and $h(x) \le x^{-1/2}$, we have $\|h(\Pi_n|_{S_n})\|_{L^2 \to L^2} \le \lambda_{\min}^{-1/2}$ with the
smallest eigenvalue $\lambda_{\min}$ of $\Pi_n|_{S_n}$.
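$\Pi_n|_{S_n}$ satisfies for $n = 2^{jd} \ge 2S - 1$ the following scaling property:

\[ \langle\Pi_n\varphi_{jk}, \varphi_{j\ell}\rangle_{L^2} = 2^{-jd}\sum_{v \in \{1,\dots,2^j\}^d} \varphi_{jk}(v2^{-j})\,\varphi_{j\ell}(v2^{-j}) \]
\[ = \sum_{m \in \mathbb{Z}^d}\ \sum_{v \in \{1,\dots,2^j\}^d}\ \prod_{a=1}^d \big(\varphi((v - k + 2^j m)_a)\,\varphi((v - \ell + 2^j m)_a)\big) \]
\[ = \prod_{a=1}^d \sum_{b \in \mathbb{Z}} \varphi(b - k_a)\,\varphi(b - \ell_a). \]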
Since $\varphi$ has compact support, the series is just a finite sum and $\Pi_n$ has a bounded
Toeplitz matrix representation in terms of $(\varphi_{jk})$. Using Fourier multipliers it follows
that $\langle\Pi_n g_n, g_n\rangle_{L^2} \ge \lambda_\varphi\|g_n\|_{L^2}^2$, $g_n \in S_n$, with $\lambda_\varphi := \inf_{u \in [0,2\pi]} |\sum_{k \in \mathbb{Z}} \varphi(k)e^{\iota ku}|^d$,
independently of $n$. Due to the compact support of $\varphi$, we have $\lambda_\varphi > 0$ iff the
trigonometric polynomial $\sum_{k \in \mathbb{Z}} \varphi(k)e^{\iota ku}$, $u \in [0, 2\pi]$, does not vanish. It is well
known (Sweldens and Piessens 1993, Lemma 3) that this is exactly the condition
to ensure that the multiresolution analysis is also generated by an interpolating
scaling function. It can be checked for standard Daubechies scaling functions, e.g.
by showing $|\varphi(k_0)| > \sum_{k \ne k_0} |\varphi(k)|$ for some $k_0 \in \mathbb{Z}$. Moreover, gaining more
flexibility by considering the shifted spaces based on $\varphi_\tau = \varphi(\cdot - \tau)$, $\tau \in (0,1)$, a
wavelet multiresolution analysis will almost always satisfy $\lambda_{\varphi_\tau} > 0$ for some value
of $\tau$, cf. Sweldens and Piessens (1993) and the references therein.
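
As a concrete instance of the non-vanishing condition, consider the Daubechies scaling function with two vanishing moments, whose only non-zero values at integers are $(1 + \sqrt{3})/2$ and $(1 - \sqrt{3})/2$ at two neighbouring integers. The sketch below evaluates the modulus of the trigonometric polynomial on a grid and checks the dominance criterion; the grid resolution is an arbitrary choice, and the position of the two integers only contributes a factor of modulus one.

```python
import numpy as np

# Non-zero integer samples of the Daubechies scaling function with two vanishing moments;
# they sit at two neighbouring integers inside the support.
phi_at_integers = np.array([(1.0 + np.sqrt(3.0)) / 2.0, (1.0 - np.sqrt(3.0)) / 2.0])

u = np.linspace(0.0, 2.0 * np.pi, 4001)
# trigonometric polynomial sum_k phi(k) e^{i k u}, up to a unimodular factor
symbol = phi_at_integers[0] + phi_at_integers[1] * np.exp(1j * u)
inf_modulus = np.abs(symbol).min()

# dominance criterion |phi(k_0)| > sum_{k != k_0} |phi(k)|
dominant = abs(phi_at_integers[0]) > abs(phi_at_integers[1])

if __name__ == "__main__":
    print("infimum of the modulus on the grid:", inf_modulus)
    print("dominant coefficient criterion satisfied:", dominant)
```

The infimum is attained at $u = 0$, where the polynomial equals $\sum_k \varphi(k) = 1$, so the trigonometric polynomial does not vanish for this scaling function.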
We arrive at

\[ \|f - (\Pi_n|_{S_n})^{-1/2}\Pi_n f\|_{L^2} \le \|f - \Pi_n f\|_{L^2} + \lambda_\varphi^{-1/2}\|(\mathrm{Id} - \Pi_n)\Pi_n f\|_{L^2}. \]

Because of $\|\Pi_n f\|_{H^s} \lesssim \|f\|_{H^s}$ (Cohen 2000, Thm. 30.7) we derive from (3.11) the
uniform estimate over $f \in \mathcal{F}^d_{S,\mathrm{per}}(s, R)$

\[ \|f - (\Pi_n|_{S_n})^{-1/2}\Pi_n f\|_{L^2} \le \|f - \Pi_n f\|_{L^2} + \lambda_\varphi^{-1/2}\|(\mathrm{Id} - \Pi_n)\Pi_n f\|_{L^2} \lesssim R\,n^{-s/d}. \]

Hence, the estimate in (3.6) yields asymptotic equivalence between the regression
and the white noise experiment for any class $\mathcal{F}^d_{S,\mathrm{per}}(s, R)$ with $s > d/2$.

This result provides another way for constructing explicitly the transformation
between the regression and the white noise setting. It has no more theoretical
implications than the Fourier basis approach, but it paves the way for proving
asymptotic equivalence for non-periodic function classes.

Non-periodic approximation. Since every $\varphi_{jk}$ has support length $2^{-j}(2S - 1)$,
only those functions $\varphi_{jk}$ with $k_r \in \{1,\dots,S - 2\} \cup \{2^j - S + 1,\dots,2^j\}$ for some
$r = 1,\dots,d$ cross the boundary and are periodized at all. Therefore, the same
derivation using only interior scaling functions shows that the regression experiment
$\mathbb{E}_n$ for the general Sobolev function class

\[ \mathcal{F}^d_S(s, R) := \{f \in H^s([0,1]^d) \mid \|f\|_{H^s} \le R\} \]

is asymptotically more informative than the restricted white noise experiment $\tilde{\mathbb{G}}_n$
given by observing

\[ dY(x) = f(x)\,dx + \frac{\sigma}{\sqrt{n}}\,dB(x), \quad x \in [\delta_n, 1 - \delta_n]^d \quad\text{with } \delta_n := (2S - 1)n^{-1/d}. \tag{3.12} \]

Although $\tilde{\mathbb{G}}_n$ is a priori less informative than $\mathbb{G}_n$, we may use classical extrapolation,
e.g. the Taylor polynomial $T^f_y$ of order $\lfloor s\rfloor$ around $y \in [\delta_n, 1 - \delta_n]^d$. We define
at the points $x \in [0,1]^d \setminus [\delta_n, 1 - \delta_n]^d$ the extrapolation $\tilde f(x) = T^f_{y_x}(x)$ for a
point $y_x \in [\delta_n, 1 - \delta_n]^d$ with $|y_x - x|_\infty \le 2\delta_n$, selected in a measurable way, and
$\tilde f(x) = f(x)$ otherwise. We thereby achieve

such that

obtained a result for function classes without periodicity condition.

3.2 Theorem. For general $d$-dimensional Sobolev classes $\mathcal{F}^d_S(s, R)$ with regularity
$s > d/2$ and equidistant design on the cube $[0,1]^d$ the nonparametric regression
experiment $\mathbb{E}_n$ and the Gaussian white noise experiment $\mathbb{G}_n$ are asymptotically
equivalent as $n \to \infty$. The Le Cam distance satisfies


Discussion. The property that a wavelet estimator based on an equidistant
regression model and a corresponding estimator based on a white noise model are
asymptotically close is well known, see e.g. Donoho and Johnstone (1999) and
Johnstone and Silverman (2004). Interestingly, both papers show identical
asymptotics of the $L^2$-risk for standard estimators uniformly over balls in Besov spaces
$B^s_{p,q}([0,1])$ with $s > 1/p$ or $s = p = 1$. Since $B^s_{p,q}$ embeds into the Sobolev space
$H^a$ for $s > a$ and $s - 1/p > a - 1/2$, Theorem 3.2 provides more generally asymptotic
equivalence for Besov classes with $s > 1/p$ and $p < 2$. The counterexample in
Brown and Low (1996) shows, however, that for $s \le 1/2$ and all $p \in [1, \infty]$,
asymptotic equivalence breaks down. Similarly, if $\psi \in B^1_{1,q}$ is a function with support
in $(0,1)$ and $\|\psi\|_{L^2} = 1$, then $\psi_n(x) := \psi(nx)$ has support in $(0, 1/n)$, $L^2$-norm
$\|\psi_n\|_{L^2} = n^{-1/2}$ and Besov norm $\|\psi_n\|_{B^1_{1,q}} \sim 1$. Hence, testing the signal $f = 0$
versus $f = \psi_n$ has nontrivial power in the white noise model $\mathbb{G}_n$, while both signals
generate exactly the same observations in the regression model $\mathbb{E}_n$. We conclude
that $\mathbb{E}_n$ and $\mathbb{G}_n$ are not asymptotically equivalent on Besov classes with $s = 1$,
$p = 1$. An intriguing example for the important class of bounded variation functions
is given by $\psi_n(x) = \sqrt{2}\,\mathbf{1}_{[1/4n, 3/4n]}(x)$. Asymptotic equivalence between Gaussian
regression and white noise is indeed an $L^2$-theory and we cannot gain by measuring
smoothness in an $L^p$-sense, $p \ne 2$.

Let us also mention that the (asymptotically negligible) loss in information
due to neglecting boundary coefficients in the construction seems unavoidable. The
wavelets on an interval (Cohen, Daubechies, and Vial 1993) use nonorthogonal
boundary corrections and can therefore not be used, while the coiflet approach by
Johnstone and Silverman (2004) also involves some information loss at the boundary,
cf. their remark on dimensions before Proposition 2.
4 Random design
4.1 The general idea 

Denote by $U([0,1]^d)$ the uniform distribution on the cube $\mathcal{D} = [0,1]^d$.

4.1 Definition. Let $\mathbb{E}_{n,r}$ be the compound experiment obtained from observing
independent random design points $X_i \sim U([0,1]^d)$, $i = 1,\dots,n$, and the regression

\[ Y_i = f(X_i) + \sigma\varepsilon_i, \quad i = 1,\dots,n, \]

for $n \in \mathbb{N}$ and $f : [0,1]^d \to \mathbb{R}$ in some class $\mathcal{F}^d \subset \mathscr{L}^2([0,1]^d)$ and with i.i.d.
random variables $\varepsilon_i \sim N(0,1)$, independent of the design.

We place ourselves into the isomorphic setting, that is we are given an
$L^2([0,1]^d)$-orthonormal basis $(\varphi_j)_{j \ge 1}$ and we set $S_n = \mathrm{span}(\varphi_1,\dots,\varphi_n)$. For the
moment we merely assume that $S_n$ is chosen to satisfy the isomorphic condition
(3.1) given the random design points $(X_i)_{1 \le i \le n}$. Later, certain parts will rely on
fine properties of the Fourier basis. Conditionally on the design the regression
experiment is equivalent to observing

\[ Z_1 := \sum_{j=1}^n \langle Y, \varphi_j\rangle_n\varphi_j = \Pi_n f + \frac{\sigma}{\sqrt{n}}(\Pi_n|_{S_n})^{1/2}\zeta \in S_n \]

with white noise $\zeta \sim N(0, \mathrm{Id}_{S_n})$. Let us briefly comment why the foregoing
approaches using $Z_2$ in (3.4) or $Z_3$ in (3.5) will not succeed here. For
$Z_2 = (\Pi_n|_{S_n})^{-1/2}Z_1$ we need to have $\|((\Pi_n|_{S_n})^{1/2} - \mathrm{Id})\mathcal{J}_n f\|_{L^2}$ and $\|\mathcal{J}_n f - f\|_{L^2}$ of smaller
order than $n^{-1/2}$. The second property can be ensured for Sobolev classes of
regularity $s > d/2$ as before. The first property, however, will not hold. By empirical
process theory, we have for $g_1, g_2 \in S_n$ approximately
$\langle\Pi_n g_1, g_2\rangle_{L^2} = \langle g_1, g_2\rangle_n \approx \langle g_1, g_2\rangle_{L^2} + n^{-1/2}\int g_1 g_2\,dB^\circ$
with a Brownian bridge $B^\circ$. By the linearisation
$(1 + h)^{1/2} - 1 \approx h/2$ and taking expectation with respect to the random design, we
find

\[ E\big[\|((\Pi_n|_{S_n})^{1/2} - \mathrm{Id})\mathcal{J}_n f\|_{L^2}^2\big] \approx E\Big[\sum_{j=1}^n \Big|n^{-1/2}\int(\mathcal{J}_n f)\varphi_j\,dB^\circ\Big|^2\Big] \sim n^{-1}\sum_{j=1}^n \int|\varphi_j\mathcal{J}_n f|^2 \gtrsim \|\mathcal{J}_n f\|_{L^2}^2. \]

Hence, in the mean over the random design this term does not tend to zero. When
considering $Z_3 = (\Pi_n|_{S_n})^{-1}Z_1$, we would need $\|(\Pi_n|_{S_n})^{-1} - \mathrm{Id}_{S_n}\|_{HS} \to 0$, compare
Bound (3.7), but the mean over this term is by the same approximations of order
$n$. The main defect in these approaches is that we do not take advantage of the
regularity of $f$.

The new idea is based on a two-level procedure, which can be interpreted as
a localisation approach, cf. Nussbaum (1996). We choose an intermediate level
$n_0 < n$ and split $S_n = S_{n_0} + U^n_{n_0}$ with the $\|\cdot\|_n$-orthogonal complement $U^n_{n_0}$ of $S_{n_0}$
in $S_n$. On the low-frequency space $S_{n_0}$ we use the empirical orthogonal projection
$P^n_{n_0}Y$ of the data onto $S_{n_0}$. This construction is analogous to $Z_3$ in (3.5) and the
heteroskedasticity in the noise term will become asymptotically negligible provided
$n_0 = o(n^{1/2})$.

On the high-frequency part $U^n_{n_0}$ of $S_n$ we transform to a Gaussian shift with
white noise, which is independent of the noise in $S_{n_0}$, in the spirit of $Z_2$ in (3.4). In
order to take advantage of the regularity of $f$, however, we do not use the standard
square root operator $\Pi_n^{-1/2}$ to whiten the noise, but the adjoint $T^*$ of an operator
$T : S_n \to S_n$ which has an upper triangular matrix representation in the basis $(\varphi_j)$
and satisfies $TT^* = (\Pi_n|_{S_n})^{-1}$ (as in the Cholesky decomposition). Since $T^*$ is a
unitary transformation of $(\Pi_n|_{S_n})^{-1/2}$, the noise part remains white. Due to the
triangular structure, the signal coefficients $\langle T^*\Pi_n f, \varphi_j\rangle_{L^2} = \langle T^{-1}\mathcal{J}_n f, \varphi_j\rangle_{L^2}$ do
not involve the (usually large) coefficients $\langle\mathcal{J}_n f, \varphi_k\rangle_{L^2}$ for indices $k$ smaller than
$j$. Moreover, for the Fourier basis the other off-diagonal matrix entries of $T^{-1}$ are
centred and uncorrelated, while the deviations in the diagonal entries grow with the
frequencies, but are exactly counter-balanced by the decay of the Fourier coefficients
for Sobolev function classes. Provided $n_0 \to \infty$, this high-frequency transformation
will imply asymptotic equivalence.
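
The triangular factorisation can be made concrete with a standard Cholesky routine. The sketch below forms the empirical Gram matrix $\Pi$ of a few one-dimensional Fourier basis functions at random design points, computes $\Pi = LL^*$ with $L$ lower triangular and sets $T := (L^*)^{-1}$. As a simplifying assumption for numerical stability, the dimension of the space (here 7) is taken much smaller than the number of design points (here 50), unlike in the text where both equal $n$; all variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
num_points, dim = 50, 7  # decoupled here for numerical stability (assumption of this sketch)
X = rng.uniform(size=num_points)  # random design in d = 1

ells = np.array([0, 1, -1, 2, -2, 3, -3])  # frequencies sorted by magnitude
V = np.exp(2j * np.pi * np.outer(X, ells))  # V[i, j] = phi_j(X_i)
Pi = V.conj().T @ V / num_points  # Pi[j, k] = <phi_k, phi_j>_n

L = np.linalg.cholesky(Pi)  # Pi = L L^*, L lower triangular with positive diagonal
T = np.linalg.inv(L).conj().T  # T = (L^*)^{-1} is upper triangular

# columns of V @ T are the values of the transformed basis functions at the design points
gram_transformed = (V @ T).conj().T @ (V @ T) / num_points

# T^* differs from Pi^{-1/2} only by the unitary factor U = T^* Pi^{1/2}
eigval, eigvec = np.linalg.eigh(Pi)
Pi_sqrt = (eigvec * np.sqrt(eigval)) @ eigvec.conj().T
U = T.conj().T @ Pi_sqrt

if __name__ == "__main__":
    print("T upper triangular:", np.allclose(T, np.triu(T)))
    print("T T^* = Pi^{-1}:", np.allclose(T @ T.conj().T, np.linalg.inv(Pi)))
    print("transformed basis empirically orthonormal:", np.allclose(gram_transformed, np.eye(dim)))
    print("U unitary:", np.allclose(U @ U.conj().T, np.eye(dim)))
```

Because the Cholesky factor has a positive diagonal, this $T$ coincides with the matrix produced by Gram-Schmidt orthonormalisation of the basis with respect to the empirical scalar product.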

4.2 The main result 

Let us specify the transformation $T$ concretely based on the Gram-Schmidt
procedure for orthonormalisation with respect to $\|\cdot\|_n$. For $j \le n$ denote by
$P_j, P^n_j : S_n \to S_n$ the $L^2$-orthogonal and $\|\cdot\|_n$-orthogonal projections onto $S_j$, respectively,
and set $P^n_0 := 0$. We obtain an $\|\cdot\|_n$-orthonormal basis $(\varphi^n_j)$ of $S_n$ via

\[ \varphi^n_j := \frac{\varphi_j - P^n_{j-1}\varphi_j}{\|\varphi_j - P^n_{j-1}\varphi_j\|_n}, \quad j = 1,\dots,n. \]

Then $\varphi^n_j$ is in $S_j$ and the $\|\cdot\|_n$-orthogonality $\varphi^n_j \perp_n S_{j-1}$ holds. Defining $T : S_n \to S_n$
via $T\varphi_j := \varphi^n_j$, we see that $T$ satisfies $\langle T\varphi_{j'}, \varphi_j\rangle_{L^2} = 0$ for $j > j'$ and is an
isometry between $(S_n, \|\cdot\|_{L^2})$ and $(S_n, \|\cdot\|_n)$ such that $\Pi_n|_{S_n} = (TT^*)^{-1}$. The noise
terms $(\langle\varepsilon, \varphi^n_j\rangle_n)_{1 \le j \le n} \sim N(0, n^{-1})$ are therefore independent and

Using P|s„ P|s„ o = (nri|s„ ) _1 , wc introduce the rescalcd covariance operator 
X : 5„ — » 5„ via 

:= (n„|s„ ) _1 P„ .9n + ( Id s„ -Pn )g n , 9n & S n . 
The regression experiment is then transformed to observing 

no n 
j=l J="o+l 

= C / + T" 1 (Pn - PZ)f + n- 1/2 <7SV2 C e ^ 

with Gaussian white noise $\zeta \sim N(0, \mathrm{Id}_{S_n})$, conditional on the random design.

4.2 Example. Let us consider the Haar basis. Write Ijh = [2~ J /c, 2~ J (k + 1)), 
N jk = #{* : X t G J jfc } and V jfc = 2^(1^^ - V J > 0, fc = 
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0, . . . , 2 J — 1. By construction the transformed basis function ipj'k ^ as support Ij k , is 
constant on Ij+i,2k, Ij+i,2k+i and satisfies (ip^ k , li jk ) n = 0, H^lln = 1- We infer 

1>?k = C 3 k(N-^l Ij 2k - N- 2 \ +1 l Ij 2k+1 ), C% = nN j+1 , 2k N j+lj2k+1 /N jk . 

This is exactly the application of our framework underlying previous one- 
dimensional constructions (Brown, Cai, Low, and Zhang 2002, Eq. (2.8)). Because 
here S n is for most design realisations not isomorphic, additional randomisations 
are needed. 
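
The formula of the example can be checked numerically. The following is a minimal sketch, not part of the original construction: it assumes a uniform design on $[0,1)$, real-valued functions, and the hypothetical helper names `empirical_haar` and `inner_n`. It builds $\psi_{jk}^n$ from the occupation numbers and evaluates the two defining properties.

```python
import math
import random

def empirical_haar(design, j, k):
    """Return psi^n_{jk} as a function, following the formula of the example.

    design: list of points in [0, 1); j, k: Haar level and shift.
    Hypothetical helper, for illustration only.
    """
    n = len(design)
    h = 2.0 ** (-(j + 1))                        # length of each half-interval
    left = (2 * k * h, (2 * k + 1) * h)          # I_{j+1,2k}
    right = ((2 * k + 1) * h, (2 * k + 2) * h)   # I_{j+1,2k+1}
    n_left = sum(left[0] <= x < left[1] for x in design)
    n_right = sum(right[0] <= x < right[1] for x in design)
    if n_left == 0 or n_right == 0:
        raise ValueError("empty half-interval: psi^n_{jk} is not defined")
    # (C^n_{jk})^2 = n N_{j+1,2k} N_{j+1,2k+1} / N_{jk}
    c = math.sqrt(n * n_left * n_right / (n_left + n_right))

    def psi(x):
        if left[0] <= x < left[1]:
            return c / n_left
        if right[0] <= x < right[1]:
            return -c / n_right
        return 0.0

    return psi

def inner_n(g, h, design):
    """Empirical inner product <g, h>_n = (1/n) sum_i g(X_i) h(X_i), real case."""
    return sum(g(x) * h(x) for x in design) / len(design)

random.seed(0)
design = [random.random() for _ in range(200)]
j, k = 2, 1
psi = empirical_haar(design, j, k)
indicator = lambda x: 1.0 if k * 2.0 ** -j <= x < (k + 1) * 2.0 ** -j else 0.0
orth = inner_n(psi, indicator, design)   # should vanish up to rounding
norm_sq = inner_n(psi, psi, design)      # should equal 1 up to rounding
```

If one of the two half-intervals contains no design point the helper raises an error, which mirrors the remark that $S_n$ is not isomorphic for most design realisations.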

For the following general $d$-dimensional theorem we consider the construction
(4.1) in terms of the Fourier basis functions $\varphi_j(x) = \exp(2\pi i \langle \ell(j), x\rangle)$ with an
enumeration $\ell : \mathbb{N} \to \mathbb{Z}^d$ of $\mathbb{Z}^d$ satisfying $|\ell(j)|_{\ell^2} \le |\ell(j')|_{\ell^2}$ for $j \le j'$ (i.e. sorted in
the order of magnitudes of the frequencies).
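
One possible enumeration $\ell$ can be sketched as follows. This is an illustration only: the function name `enumerate_frequencies` and the lexicographic tie-breaking are assumptions, since the text merely requires the Euclidean norms to be non-decreasing.

```python
import itertools

def enumerate_frequencies(d, count):
    """Return the first `count` vectors of Z^d sorted by Euclidean norm.

    Ties are broken lexicographically, which is an arbitrary choice.
    """
    # Grow a cube of half-width `radius` until the ball of that radius,
    # which lies inside the cube, holds at least `count` lattice points.
    radius = 0
    while True:
        cube = itertools.product(range(-radius, radius + 1), repeat=d)
        inside = [m for m in cube if sum(c * c for c in m) <= radius * radius]
        if len(inside) >= count:
            break
        radius += 1
    # Every lattice point outside the ball has a larger norm than all points
    # inside, so sorting the ball gives the globally first `count` frequencies.
    inside.sort(key=lambda m: (sum(c * c for c in m), m))
    return inside[:count]

freqs = enumerate_frequencies(2, 9)   # starts with the zero frequency
```

With this ordering the first frequency is zero, so that $\varphi_1 = 1$, in line with the later use of $\ell(1) = 0$.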

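4.3 Theorem. For $d$-dimensional periodic Sobolev classes $\mathscr{H}^d_{\mathrm{per}}(s, R)$ with
regularity $s > d/2$ the nonparametric regression experiment $\mathbb{E}_r$ with random design
and the Gaussian shift experiment $\mathbb{G}_n$ are asymptotically equivalent as $n_0, n \to \infty$
and $n_0 = o(n^{1/2})$. The Le Cam distance satisfies

$$\Delta_{\mathscr{H}^d_{\mathrm{per}}(s,R)}(\mathbb{E}_r, \mathbb{G}_n) \lesssim n^{-1/2} n_0 + \sigma^{-1} R\, n_0^{1/2 - s/d}.$$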

4.4 Remark. The asymptotically optimal choice of $n_0$ is given by $n_0 \sim n^{d/(2s+d)}$,
which yields a bound on the Le Cam distance of order $n^{(d-2s)/(2d+4s)}$. Note that
this choice $n_0 \sim n^{d/(2s+d)}$ corresponds exactly to the optimal dimension of the
approximation spaces in nonparametric regression and is also used by Gaiffas (2005)
for his two-level construction of optimal confidence bands.
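
The exponents in the remark can be reproduced by exact arithmetic. The sketch below assumes that the two competing terms are $n^{-1/2} n_0$ and $n_0^{1/2 - s/d}$, as they arise from the sum $I + II + III$ in the proof; the function name `balanced_exponents` is hypothetical.

```python
from fractions import Fraction

def balanced_exponents(s, d):
    """Exponents (in n) obtained by balancing n^{-1/2} n0 with n0^{1/2 - s/d}.

    Writing n0 = n^a, the two terms have exponents a - 1/2 and a (1/2 - s/d);
    equating them gives a = d / (2s + d).
    """
    s, d = Fraction(s), Fraction(d)
    a = d / (2 * s + d)                      # exponent of n in n0
    first = a - Fraction(1, 2)               # exponent of n^{-1/2} n0
    second = a * (Fraction(1, 2) - s / d)    # exponent of n0^{1/2 - s/d}
    return a, first, second

a, first, second = balanced_exponents(s=2, d=2)
rate = Fraction(2 - 2 * 2, 2 * 2 + 4 * 2)   # (d - 2s) / (2d + 4s) for s = d = 2
```

For $s = d = 2$ this gives $n_0 \sim n^{1/3}$ and both terms of order $n^{-1/6}$, which agrees with $(d - 2s)/(2d + 4s) = -1/6$.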

Proof. In order to bound the Le Cam distance for compound experiments, we use
that for distributions $K \otimes P$ and $K' \otimes P$, defined on $(\Omega \times \Omega', \mathscr{F} \otimes \mathscr{F}')$ by the
measure $P$ on $\mathscr{F}$ and the Markov kernels $K$, $K'$ from $\Omega$ to $\mathscr{F}'$, the total variation
distance can be calculated by conditioning:

$$\|K \otimes P - K' \otimes P\|_{TV(\Omega \times \Omega')} = \int_\Omega \|K(\omega, \cdot) - K'(\omega, \cdot)\|_{TV(\Omega')} \, P(d\omega).$$

Therefore we can first work conditionally on the design and then take expectations
for $(X_i)$. Moreover, the white noise experiment $\mathbb{G}_n$ is equivalent to the compound
experiment of $\mathbb{G}_n$ and the observation of the random design points because the
latter is a trivial randomisation of $\mathbb{G}_n$.
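
The conditioning identity is easy to verify on finite spaces. The following sketch uses made-up toy distributions (`P`, `K`, `K2` are assumptions for illustration) and the normalisation of total variation as half the $\ell^1$ distance; the identity holds for either normalisation.

```python
def total_variation(p, q):
    """Total variation distance between finite distributions given as dicts."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in keys)

def compound(kernel, marginal):
    """Joint law of (omega, observation) built from a kernel and a marginal."""
    return {(w, y): marginal[w] * kernel[w][y]
            for w in marginal for y in kernel[w]}

# P: law of the "design" omega; K, K2: Markov kernels omega -> observation law.
P = {"a": 0.25, "b": 0.75}
K = {"a": {0: 0.5, 1: 0.5}, "b": {0: 0.9, 1: 0.1}}
K2 = {"a": {0: 0.2, 1: 0.8}, "b": {0: 0.6, 1: 0.4}}

joint_tv = total_variation(compound(K, P), compound(K2, P))
conditional_tv = sum(P[w] * total_variation(K[w], K2[w]) for w in P)
```

Both quantities coincide, which is why the proof may bound the distance conditionally on the design first and integrate afterwards.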

It is a remarkable property of the Fourier basis that $S_n$ is almost surely isomorphic,
cf. Theorem 1.1 in Bass and Grochenig (2004). In Proposition 4.8 below
we prove that the event

$$\Omega_j^n := \{\forall g \in S_j : \tfrac12 \|g\|_{L^2} \le \|g\|_n \le 2\|g\|_{L^2}\} \quad (4.2)$$

for $j \log(j) = o(n)$ even satisfies $P((\Omega_j^n)^{\complement}) \to 0$ with a convergence rate faster than
any polynomial in $n$. This is much tighter with respect to the subspace dimension
than what can be derived from Bass and Grochenig (2004). In order to establish
asymptotic equivalence, it suffices therefore to estimate the total variation distances
on the event $\Omega_{n_0}^n$.
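
A small simulation illustrates the norm equivalence on this event. It is only a spot check under assumptions: dimension $d = 1$, a uniform design, and a handful of random trigonometric polynomials rather than all $g \in S_j$; the names `dim`, `freqs` and `norm_ratio` are hypothetical.

```python
import cmath
import random

random.seed(1)
n, dim = 2000, 8                        # sample size and subspace dimension
design = [random.random() for _ in range(n)]   # uniform design on [0, 1)
freqs = [0, -1, 1, -2, 2, -3, 3, -4][:dim]     # frequencies sorted by magnitude

def norm_ratio(coeffs):
    """Return ||g||_n^2 / ||g||_{L^2}^2 for g = sum_k coeffs[k] exp(2 pi i freqs[k] x)."""
    l2_sq = sum(abs(c) ** 2 for c in coeffs)   # Parseval's identity
    emp_sq = 0.0
    for x in design:
        g = sum(c * cmath.exp(2j * cmath.pi * m * x)
                for c, m in zip(coeffs, freqs))
        emp_sq += abs(g) ** 2
    return emp_sq / n / l2_sq

ratios = []
for _ in range(20):
    coeffs = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(dim)]
    ratios.append(norm_ratio(coeffs))
# On the event (4.2) every squared ratio lies between 1/4 and 4.
```

A full check of the event would require the extreme eigenvalues of the empirical Gram matrix; the random test functions above only probe it.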

By (4.1), the regression experiment $\mathbb{E}_r$ is equivalent to observing $Z_r$ together
with the design. Introducing

$$\bar Z_r := P_n f + \sigma n^{-1/2} \zeta \in S_n, \quad (4.3)$$

we shall prove in a moment that (with obvious notation)

$$\Delta_{\mathscr{H}^d_{\mathrm{per}}(s,R)}(Z_r, \bar Z_r) \lesssim n^{-1/2} n_0 + \sigma^{-1} R\, n_0^{1/2 - s/d}, \quad (4.4)$$

but then the assertion follows: Observing $\bar Z_r$ is equivalent to observing

$$dY(x) = P_n f(x)\,dx + \sigma n^{-1/2} dB(x), \quad x \in [0,1]^d,$$

which has a total variation distance to the Gaussian shift $\mathbb{G}_n$ of order
$\sigma^{-1} n^{1/2} \|f - P_n f\|_{L^2} \lesssim \sigma^{-1} n^{1/2 - s/d} \|f\|_{H^s}$. Using the triangle inequality for the Le Cam
distance between the intermediate experiments, we arrive at the bound for
the Le Cam distance stated in the theorem.
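To obtain (4.4), we take expectations over the design and split

$$E\big[\|\mathscr{L}(Z_r) - \mathscr{L}(\bar Z_r)\|_{TV}^2 \, 1_{\Omega_{n_0}^n}\big] \lesssim I + II + III$$

with the terms

$$\begin{aligned}
I &:= n\sigma^{-2} E\big[\|(P_{n_0}^n - P_{n_0}) f\|_{L^2}^2 \, 1_{\Omega_{n_0}^n}\big] && \text{(difference in mean on } S_{n_0}\text{)}, \\
II &:= E\big[\|(\Pi_n|_{S_{n_0}})^{-1} - \mathrm{Id}_{S_{n_0}}\|_{HS}^2 \, 1_{\Omega_{n_0}^n}\big] && \text{(heteroskedasticity on } S_{n_0}\text{)}, \\
III &:= n\sigma^{-2} E\big[\|(T^{-1}(P_n^n - P_{n_0}^n) - (P_n - P_{n_0})) f\|_{L^2}^2 \, 1_{\Omega_{n_0}^n}\big] && \text{(difference in mean on } S_{n_0}^{\perp}\text{)}.
\end{aligned}$$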

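Term I. Using the projection properties, we obtain on $\Omega_{n_0}^n$:

$$\|(P_{n_0}^n - P_{n_0}) f\|_{L^2}^2 = \|P_{n_0}^n (\mathrm{Id} - P_{n_0}) f\|_{L^2}^2 \le 4 \|P_{n_0}^n (\mathrm{Id} - P_{n_0}) f\|_n^2.$$

Because of $E[\langle \varphi_k, \varphi_j^n\rangle_n \overline{\langle \varphi_{k'}, \varphi_j^n\rangle_n}] = 0$ for $k \ne k'$, $k, k' > j$ by Proposition 4.5
below, an expansion in the basis $(\varphi_j^n)$ yields

$$E\big[\|P_{n_0}^n (\mathrm{Id} - P_{n_0}) f\|_n^2\big]
= \sum_{j=1}^{n_0} \sum_{k=n_0+1}^{\infty} |\langle f, \varphi_k\rangle_{L^2}|^2 \, E\big[|\langle \varphi_k, \varphi_j^n\rangle_n|^2\big]
= \sum_{k=n_0+1}^{\infty} |\langle f, \varphi_k\rangle_{L^2}|^2 \, E\big[\|P_{n_0}^n \varphi_k\|_n^2\big].$$

Proposition 4.9 below yields $E[\|P_{n_0}^n \varphi_k\|_n^2] \lesssim k/n$ and hence

$$I \lesssim \sigma^{-2} \sum_{k=n_0+1}^{\infty} |\langle f, \varphi_k\rangle_{L^2}|^2 \, k \lesssim \sigma^{-2} n_0^{1 - 2s/d} \|f\|_{H^s}^2.$$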

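Term II. Using $\|(\Pi_n|_{S_{n_0}})^{-1}\|_{L^2 \to L^2} \le 4$ on $\Omega_{n_0}^n$, we find:

$$\begin{aligned}
E\big[\|(\Pi_n|_{S_{n_0}})^{-1} - \mathrm{Id}_{S_{n_0}}\|_{HS}^2 \, 1_{\Omega_{n_0}^n}\big]
&\le E\big[\|(\Pi_n|_{S_{n_0}})^{-1}\|_{L^2 \to L^2} \, \|\Pi_n|_{S_{n_0}} - \mathrm{Id}_{S_{n_0}}\|_{HS}^2 \, 1_{\Omega_{n_0}^n}\big] \\
&\le 4 E\big[\|\Pi_n|_{S_{n_0}} - \mathrm{Id}_{S_{n_0}}\|_{HS}^2\big] \\
&= 4 \sum_{j,j'=1}^{n_0} E\big[|\langle \varphi_j, \varphi_{j'}\rangle_n - \delta_{j,j'}|^2\big] \\
&\le 4 n^{-1} \sum_{j,j'=1}^{n_0} \int |\varphi_j \varphi_{j'}|^2.
\end{aligned}$$

For the Fourier basis we obtain $II \le 4 n^{-1} n_0^2$.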

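Term III. Let us write $f = f_0 + f_1 + f_2$ with $f_0 = P_{n_0} f$, $f_1 = (P_n - P_{n_0}) f$,
$f_2 = (\mathrm{Id} - P_n) f$. Then the projection properties imply

$$\begin{aligned}
&E\big[\|(T^{-1}(P_n^n - P_{n_0}^n) - (P_n - P_{n_0})) f\|_{L^2}^2 \, 1_{\Omega_{n_0}^n}\big] \\
&\quad = E\big[\|T^{-1} f_1 + T^{-1} P_n^n f_2 - T^{-1} P_{n_0}^n (f_1 + f_2) - f_1\|_{L^2}^2 \, 1_{\Omega_{n_0}^n}\big] \\
&\quad \le 3 E\big[\big(\|(T^{-1} - \mathrm{Id}) f_1\|_{L^2}^2 + \|(P_n^n - P_{n_0}^n) f_2\|_n^2 + \|P_{n_0}^n f_1\|_n^2\big) 1_{\Omega_{n_0}^n}\big] \\
&\quad \le 3 E\big[\|f_1\|_n^2 + \|f_1\|_{L^2}^2 - 2 \mathrm{Re}\langle T^{-1} f_1, f_1\rangle_{L^2}\big] + 3 E\big[\|f_2\|_n^2\big] + 3 E\big[\|P_{n_0}^n f_1\|_n^2 \, 1_{\Omega_{n_0}^n}\big] \\
&\quad = 6 E\big[\mathrm{Re}\langle f_1 - T^{-1} f_1, f_1\rangle_{L^2}\big] + 3 \|f_2\|_{L^2}^2 + 3 E\big[\|P_{n_0}^n f_1\|_n^2 \, 1_{\Omega_{n_0}^n}\big] \\
&\quad =: III_1 + III_2 + III_3.
\end{aligned}$$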

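The term $III_2$ is easily bounded by $\|f_2\|_{L^2}^2 \lesssim n^{-2s/d} \|f\|_{H^s}^2$. As in the estimate
for term I, we obtain $III_3 \lesssim n^{-1} n_0^{1 - 2s/d} \|f\|_{H^s}^2$. For $III_1$ we use
$E[\langle T^{-1} \varphi_j, \varphi_k\rangle_{L^2}] = 0$, $j \ne k$, by Proposition 4.5 below to conclude

$$E\big[\mathrm{Re}\langle f_1 - T^{-1} f_1, f_1\rangle_{L^2}\big] = \sum_{j=n_0+1}^{n} |\langle f, \varphi_j\rangle_{L^2}|^2 \, E\big[\mathrm{Re}\langle (\mathrm{Id} - T^{-1}) \varphi_j, \varphi_j\rangle_{L^2}\big].$$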

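Because of $\|\varphi_j\|_n = 1$ for the Fourier basis we find

$$\langle T^{-1} \varphi_j, \varphi_j\rangle_{L^2} = \langle \varphi_j, \varphi_j^n\rangle_n = \|\varphi_j - P_{j-1}^n \varphi_j\|_n = \big(1 - \|P_{j-1}^n \varphi_j\|_n^2\big)^{1/2} \ge 1 - \|P_{j-1}^n \varphi_j\|_n^2.$$

By Proposition 4.9 below, the bound

$$E\big[\mathrm{Re}\langle f_1 - T^{-1} f_1, f_1\rangle_{L^2}\big] \le \sum_{j=n_0+1}^{n} |\langle f, \varphi_j\rangle_{L^2}|^2 \, E\big[\|P_{j-1}^n \varphi_j\|_n^2\big] \lesssim \sum_{j=n_0+1}^{n} \frac{j}{n} |\langle f, \varphi_j\rangle_{L^2}|^2$$

follows, which is of order $n^{-1} n_0^{1 - 2s/d} \|f\|_{H^s}^2$. Putting the estimates together, we have
shown

$$III \lesssim \sigma^{-2} \big(n_0^{1 - 2s/d} \|f\|_{H^s}^2 + n^{1 - 2s/d} \|f\|_{H^s}^2 + n_0^{1 - 2s/d} \|f\|_{H^s}^2\big) \lesssim \sigma^{-2} n_0^{1 - 2s/d} \|f\|_{H^s}^2$$

and in sum $I + II + III \lesssim \sigma^{-2} n_0^{1 - 2s/d} R^2 + n^{-1} n_0^2$, uniformly over $f \in \mathscr{H}^d_{\mathrm{per}}(s, R)$,
which gives the asserted bound (4.4). □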

4.3 Technical results 

We gather results on fine properties of the Fourier basis $(\varphi_j)$ and its generated
approximation spaces $S_n$. The setting is as in the proof of Theorem 4.3. For the
value of the next proposition notice that $\langle \varphi_{k'}, \varphi_k^n\rangle_n = \langle T^{-1} \varphi_{k'}, \varphi_k\rangle_{L^2}$.






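4.5 Proposition. We have for indices $k'', k' > k \ge 1$, $k'' \ne k'$:

$$E\big[\langle \varphi_{k'}, \varphi_k^n\rangle_n\big] = 0 \quad \text{and} \quad E\big[\langle \varphi_{k'}, \varphi_k^n\rangle_n \overline{\langle \varphi_{k''}, \varphi_k^n\rangle_n}\big] = 0.$$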

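Proof. Since the randomness enters via $P_{k-1}^n$ in a very intricate way, we use a symmetry
argument. Specify $X_i := (Y_i + \vartheta) \bmod 1$, $i = 1, \dots, n$, with $Y_i \sim U([0,1]^d)$,
$\vartheta \sim U([0,1]^d)$ all independent such that $X_i \sim U([0,1]^d)$ i.i.d. Working conditionally
on $(Y_i)$, we shall keep track of the dependence on $\vartheta$ using brackets. We claim that for
$k' > k$ it holds that

$$\langle \varphi_{k'}, \varphi_k^n\rangle_n[\vartheta] = e^{2\pi i \langle \ell(k') - \ell(k), \vartheta\rangle} \langle \varphi_{k'}, \varphi_k^n\rangle_n[0], \quad (4.5)$$

which entails the result due to

$$\int_{[0,1]^d} e^{2\pi i \langle \ell(k') - \ell(k), \vartheta\rangle} \, d\vartheta = 0 \quad \text{and} \quad \int_{[0,1]^d} e^{2\pi i (\langle \ell(k') - \ell(k), \vartheta\rangle - \langle \ell(k'') - \ell(k), \vartheta\rangle)} \, d\vartheta = 0.$$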
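
The shift relation (4.5) can be observed numerically. The sketch below is an illustration under assumptions: dimension $d = 1$, a fixed shift `theta`, a short frequency list `freqs`, and hypothetical helpers `inner_n`, `gram_schmidt` and `fourier_values` that work on vectors of function values at the design points, with the empirical inner product conjugating its second argument.

```python
import cmath
import random

def inner_n(u, v):
    """Empirical inner product of two vectors of function values at the design."""
    return sum(a * b.conjugate() for a, b in zip(u, v)) / len(u)

def gram_schmidt(vectors):
    """Orthonormalise value vectors with respect to inner_n (classical Gram-Schmidt)."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            coef = inner_n(v, b)
            w = [wi - coef * bi for wi, bi in zip(w, b)]
        norm = abs(inner_n(w, w)) ** 0.5
        basis.append([wi / norm for wi in w])
    return basis

def fourier_values(freqs, design):
    """Values exp(2 pi i m x) at the design points, one vector per frequency m."""
    return [[cmath.exp(2j * cmath.pi * m * x) for x in design] for m in freqs]

random.seed(2)
n = 50
freqs = [0, -1, 1, -2, 2]                    # l(1), ..., l(5) for d = 1
ys = [random.random() for _ in range(n)]     # unshifted design Y_i
theta = 0.3721                               # the shift, any value in [0, 1)
xs = [(y + theta) % 1.0 for y in ys]         # shifted design X_i

phi0, phi_theta = fourier_values(freqs, ys), fourier_values(freqs, xs)
gs0, gs_theta = gram_schmidt(phi0), gram_schmidt(phi_theta)

k, k_prime = 2, 4                            # 0-based positions with k' > k
lhs = inner_n(phi_theta[k_prime], gs_theta[k])
phase = cmath.exp(2j * cmath.pi * (freqs[k_prime] - freqs[k]) * theta)
rhs = phase * inner_n(phi0[k_prime], gs0[k])
```

The two sides agree up to rounding, so averaging over a uniform shift annihilates the inner product, which is the mechanism behind the proposition.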

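For $m \in \mathbb{Z}^d$ put

$$A_m[\vartheta] := \frac{1}{n} \sum_{i=1}^{n} e^{2\pi i \langle m, X_i\rangle} = e^{2\pi i \langle m, \vartheta\rangle} \frac{1}{n} \sum_{i=1}^{n} e^{2\pi i \langle m, Y_i\rangle} = e^{2\pi i \langle m, \vartheta\rangle} A_m[0].$$

The proof of (4.5) will be performed by induction from $\tilde k < k$ to $k$, considering
tuples $(k', \tilde k)$, $k' > \tilde k$, and $(k', k)$, $k' > k$. Since $\ell(1) = 0$ and $\varphi_1^n = \varphi_1 = 1$, we have
for $k' > 1$ and $k = 1$

$$\langle \varphi_{k'}, \varphi_1^n\rangle_n[\vartheta] = \frac{1}{n} \sum_{i=1}^{n} e^{2\pi i \langle \ell(k'), X_i\rangle} = A_{\ell(k')}[\vartheta] = e^{2\pi i \langle \ell(k'), \vartheta\rangle} \langle \varphi_{k'}, \varphi_1^n\rangle_n[0].$$

Writing $c_k := \|\varphi_k - P_{k-1}^n \varphi_k\|_n^{-1}$, the induction hypothesis implies

$$c_k^{-2}[\vartheta] = 1 - \sum_{r=1}^{k-1} |\langle \varphi_k, \varphi_r^n\rangle_n|^2[\vartheta] = c_k^{-2}[0]$$

and furthermore

$$\begin{aligned}
\langle \varphi_{k'}, \varphi_k^n\rangle_n[\vartheta]
&= \Big\langle \varphi_{k'}, c_k \Big(\varphi_k - \sum_{r=1}^{k-1} \langle \varphi_k, \varphi_r^n\rangle_n \varphi_r^n\Big)\Big\rangle_n[\vartheta] \\
&= c_k \Big(\langle \varphi_{k'}, \varphi_k\rangle_n - \sum_{r=1}^{k-1} \langle \varphi_{k'}, \varphi_r^n\rangle_n \overline{\langle \varphi_k, \varphi_r^n\rangle_n}\Big)[\vartheta] \\
&= c_k[\vartheta] \Big(A_{\ell(k') - \ell(k)}[\vartheta] - \sum_{r=1}^{k-1} e^{2\pi i \langle \ell(k') - \ell(k), \vartheta\rangle} \langle \varphi_{k'}, \varphi_r^n\rangle_n[0] \, \overline{\langle \varphi_k, \varphi_r^n\rangle_n[0]}\Big) \\
&= e^{2\pi i \langle \ell(k') - \ell(k), \vartheta\rangle} \langle \varphi_{k'}, \varphi_k^n\rangle_n[0],
\end{aligned}$$

which proves the induction step and gives (4.5). □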

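4.6 Proposition. Suppose $g = \sum_{|\ell|_{\ell^2} \le L} \gamma_\ell e^{2\pi i \langle \ell, \cdot\rangle}$ is a $d$-dimensional trigonometric
polynomial of degree $L$. Let $\Delta \in (0, L^{-1}]$ with $1/\Delta \in \mathbb{N}$ be given and define the
cubes $C_m := \prod_{i=1}^{d} [(m_i - 1)\Delta, m_i \Delta)$. Then:

$$\Delta^d \sum_{m \in \{1, \dots, \Delta^{-1}\}^d} \sup_{x_m \in C_m} \big| |g(x_m)|^2 - |g(m\Delta)|^2 \big| \le \|g\|_{L^2}^2 \big(e^{2d\Delta L} - 1\big).$$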

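Proof. We need multi-indices $\alpha, \beta \in \mathbb{N}_0^d$ with $\alpha! := \alpha_1! \cdots \alpha_d!$, $x^\alpha := x_1^{\alpha_1} \cdots x_d^{\alpha_d}$,
$\binom{\alpha}{\beta} := \frac{\alpha!}{\beta! (\alpha - \beta)!}$ and differential operators $D^\alpha := \frac{\partial^{|\alpha|}}{\partial x_1^{\alpha_1} \cdots \partial x_d^{\alpha_d}}$. Since $|g|^2$ is
real-analytic, a power series expansion gives for any $x_m \in C_m$:

$$\big| |g|^2(x_m) - |g|^2(m\Delta) \big| = \Big| \sum_{\alpha \ne 0} D^\alpha |g|^2(m\Delta) \frac{(x_m - m\Delta)^\alpha}{\alpha!} \Big|
\le \sum_{\alpha \ne 0} \frac{\Delta^{|\alpha|}}{\alpha!} \sum_{\beta \le \alpha} \binom{\alpha}{\beta} |D^\beta g(m\Delta)| \, |D^{\alpha - \beta} g(m\Delta)|.$$

Together with $g$ any derivative is again a trigonometric polynomial of degree $L$ and
by the isometry (2.3) and Bernstein's inequality, cf. (Meyer 1995, p. 32), we obtain

$$\Delta^d \sum_{m \in \{1, \dots, \Delta^{-1}\}^d} |D^\alpha g(m\Delta)|^2 = \|D^\alpha g\|_{L^2}^2 \le L^{2|\alpha|} \|g\|_{L^2}^2.$$

This implies by the Cauchy-Schwarz inequality

$$\begin{aligned}
&\Delta^d \sum_{m \in \{1, \dots, \Delta^{-1}\}^d} \sup_{x_m \in C_m} \big| |g|^2(x_m) - |g|^2(m\Delta) \big| \\
&\quad \le \Delta^d \sum_{\alpha \ne 0} \frac{\Delta^{|\alpha|}}{\alpha!} \sum_{\beta \le \alpha} \binom{\alpha}{\beta} \Big(\sum_m |D^\beta g(m\Delta)|^2\Big)^{1/2} \Big(\sum_m |D^{\alpha - \beta} g(m\Delta)|^2\Big)^{1/2} \\
&\quad \le \|g\|_{L^2}^2 \sum_{\alpha \ne 0} \frac{\Delta^{|\alpha|}}{\alpha!} \sum_{\beta \le \alpha} \binom{\alpha}{\beta} L^{|\beta|} L^{|\alpha - \beta|} \\
&\quad = \|g\|_{L^2}^2 \sum_{\alpha \ne 0} \frac{(2\Delta L)^{|\alpha|}}{\alpha!} \\
&\quad = \|g\|_{L^2}^2 \big(e^{2d\Delta L} - 1\big).
\end{aligned}$$

□

4.7 Lemma. Let $Y \in \mathbb{R}^r$ follow the multinomial distribution with parameters $n$
and $p_1 = \dots = p_r = 1/r$. Then for $n \to \infty$ and $r = r(n)$ with $r \log(r)/n \to 0$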



max — n/r(n)\ > Cy/n\og{r{n))/r(n) 

n— >oo Vl^i^r(n) 



< 1. 



Proof. If $X_1, \dots, X_r$ are independently Poisson($n/r$)-distributed, then it is well
known that the law of $(X_1, \dots, X_r)$ given $\sum_{i=1}^{r} X_i = n$ is multinomial with parameters
$n$ and $p_1 = \dots = p_r = 1/r$. Set $A_{nr} := C \sqrt{n \log(r)/r}$. Since

$$k \mapsto P\Big(\max_{1 \le i \le r} X_i - n/r > A_{nr} \,\Big|\, \sum_{i=1}^{r} X_i = k\Big)$$

is obviously increasing in $k \in \mathbb{N}$, we obtain

$$P\Big(\max_{1 \le i \le r} X_i - n/r > A_{nr} \,\Big|\, \sum_{i=1}^{r} X_i = n\Big) \le \frac{P\big(\max_{1 \le i \le r} X_i - n/r > A_{nr}\big)}{P\big(\sum_{i=1}^{r} X_i \ge n\big)}.$$

As $\sum_{i=1}^{r} X_i$ is Poisson($n$)-distributed, $\lim_{n \to \infty} P(\sum_{i=1}^{r} X_i \ge n) = 1/2$ holds,
whence

$$\limsup_{n \to \infty} \Big(P\big(\max_{i \le r} Y_i - n/r > A_{nr}\big) - 2 P\big(\max_{i \le r} X_i - n/r > A_{nr}\big)\Big) \le 0. \quad (4.6)$$

By the exponential moment estimate $E[e^{a(X_i - n/r)}] = e^{n(e^a - a - 1)/r} \le e^{3na^2/(4r)}$ for
$a := r A_{nr}/n \to 0$ and $n$ large, the generalized Markov inequality yields

$$P\big(\max_{1 \le i \le r} X_i - n/r > A_{nr}\big) \le r P(X_1 - n/r > A_{nr}) \le r e^{3na^2/(4r) - a A_{nr}} = r^{1 - C^2/4}.$$

By use of (4.6) and a completely symmetric argument for $P(\max_{1 \le i \le r} (n/r - X_i) > A_{nr})$,
the result follows. □
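
The last step of the computation, where the exponent collapses to a power of $r$, can be confirmed numerically. The sketch below is an illustration with arbitrarily chosen values of $n$, $r$ and $C$; the function name `poisson_tail_bound` is hypothetical.

```python
import math

def poisson_tail_bound(n, r, C):
    """Evaluate r * exp(3 n a^2 / (4 r) - a A) with A = C sqrt(n log(r) / r), a = r A / n.

    Returns the bound together with the tilt parameter a.
    """
    A = C * math.sqrt(n * math.log(r) / r)
    a = r * A / n
    return r * math.exp(3 * n * a * a / (4 * r) - a * A), a

n, r, C = 10**6, 100, 3.0
bound, a = poisson_tail_bound(n, r, C)
closed_form = r ** (1 - C * C / 4)
# The moment estimate needs e^a - a - 1 <= 3 a^2 / 4, which holds for small a > 0.
moment_ok = math.exp(a) - a - 1 <= 0.75 * a * a
```

Since $a A_{nr} = C^2 \log r$ and $3na^2/(4r) = \tfrac34 C^2 \log r$, the exponent equals $-\tfrac14 C^2 \log r$, which is what the comparison with `closed_form` reflects.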



4.8 Proposition. For $j = j(n)$ such that $j \log(j) = o(n)$ and the event $\Omega_j^n$ in (4.2)
we have $\lim_{n \to \infty} n^p P((\Omega_{j(n)}^n)^{\complement}) = 0$ for any power $p > 0$.

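Proof. From Proposition 4.6 we derive with $\Delta^{-1} \ge L := |\ell(j)|_{\ell^2}$, $1/\Delta \in \mathbb{N}$, the cubes
$C_m := \prod_{i=1}^{d} [(m_i - 1)\Delta, m_i \Delta)$ and the occupations $N_m := \#\{i : X_i \in C_m\}$ for $g \in S_j$:

$$\begin{aligned}
\big| \|g\|_{L^2}^2 - \|g\|_n^2 \big|
&= \frac{1}{n} \Big| \sum_{m \in \{1, \dots, \Delta^{-1}\}^d} \Big(\Delta^d n |g(m\Delta)|^2 - \sum_{i : X_i \in C_m} |g(X_i)|^2\Big) \Big| \\
&\le \frac{1}{n} \sum_{m \in \{1, \dots, \Delta^{-1}\}^d} \Big(|\Delta^d n - N_m| \, |g(m\Delta)|^2 + N_m \sup_{x_m \in C_m} \big| |g(m\Delta)|^2 - |g(x_m)|^2 \big| \Big) \\
&\le \frac{\|g\|_{L^2}^2}{\Delta^d n} \max_{m \in \{1, \dots, \Delta^{-1}\}^d} \Big(|\Delta^d n - N_m| + N_m \big(e^{2d\Delta L} - 1\big)\Big) \\
&\le \|g\|_{L^2}^2 \Big(e^{2d\Delta L} \max_{m \in \{1, \dots, \Delta^{-1}\}^d} |1 - N_m/(n\Delta^d)| + \big(e^{2d\Delta L} - 1\big)\Big).
\end{aligned}$$

By Lemma 4.7, $\max_m |1 - N_m/(n\Delta^d)|^2 \ge C (n\Delta^d)^{-1} \log(1/\Delta)$ has probability tending
to zero with any given polynomial rate when choosing $C$ sufficiently large.
Since $L^d \log(L) \lesssim j \log(j) = o(n)$, we can choose $\Delta = o(L^{-1})$ such that still
$\Delta^{-d} \log^2(1/\Delta) = o(n)$ holds. This gives

$$\big| \|g\|_{L^2}^2 - \|g\|_n^2 \big| \le \Big(C e^{2d\Delta L} (n\Delta^d)^{-1} \log(1/\Delta) + \big(e^{2d\Delta L} - 1\big)\Big) \|g\|_{L^2}^2 \le \tfrac34 \|g\|_{L^2}^2$$

for large $n$ with probability larger than $1 - n^{-p}$. □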
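4.9 Proposition. For $j \in \mathbb{N}$ with $j \log(j) = o(n)$ we have

$$E\big[\|P_{j-1}^n \varphi_j\|_n^2\big] \lesssim j/n.$$

Proof. By construction, $\|P_{j-1}^n \varphi_j\|_n \le \|\varphi_j\|_n = 1$ holds so that by Proposition 4.8
it suffices to find the bound for the expectation on the event $\Omega_j^n$.

Setting $A_m := \frac{1}{n} \sum_{k=1}^{n} \exp(2\pi i \langle m, X_k\rangle)$, $m \in \mathbb{Z}^d$, we use Parseval's identity
and $E[|A_m|^2] = 1/n$ for $m \ne 0$ to obtain

$$E\big[\|P_{j-1}^n \varphi_j\|_n^2 \, 1_{\Omega_j^n}\big]
= E\Big[\sup_{g \in S_{j-1}} \frac{|\langle \varphi_j, g\rangle_n|^2}{\|g\|_n^2} \, 1_{\Omega_j^n}\Big]
\le 4 E\Big[\sum_{r=1}^{j-1} |\langle \varphi_j, \varphi_r\rangle_n|^2\Big]
= 4 \sum_{r=1}^{j-1} E\big[|A_{\ell(j) - \ell(r)}|^2\big] = \frac{4(j-1)}{n}.$$

This gives the result.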